Suppose two Hermitian matrices $A, B$ almost commute ($\|[A,B]\| \le \delta$). Are they close to a
commuting pair of Hermitian matrices, $A', B'$, with $\|A - A'\|, \|B - B'\| \le \epsilon$? A theorem of H.
Lin [3] shows that this is uniformly true, in that for every $\epsilon > 0$ there exists a $\delta > 0$, independent
of the size $N$ of the matrices, for which almost commuting implies being close to a commuting
pair. However, this theorem does not specify how $\delta$ depends on $\epsilon$. We give uniform bounds relating
$\delta$ and $\epsilon$. The proof is constructive, giving an explicit algorithm to construct $A'$ and $B'$. We
provide tighter bounds in the case of block tridiagonal and tridiagonal matrices. Within the context
of quantum measurement, this implies an algorithm to construct a basis in which we can make
a projective measurement that approximately measures two approximately commuting operators
simultaneously. Finally, we comment briefly on the case of approximately measuring three or more
approximately commuting operators using POVMs (positive operator-valued measures) instead of
projective measurements.

The problem of when two almost commuting matrices are close to matrices which exactly commute, or, equivalently,
when a matrix which is close to normal is close to a normal matrix, has a long history. See, for example, [3] and the
other references in [3], where it is mentioned that the problem dates back to the 1950s or earlier. Finally, in 1995,
Lin [3] proved that for any $\epsilon > 0$, there is a $\delta > 0$ such that for all $N$, for any pair of Hermitian $N$-by-$N$ matrices $A, B$
with $\|A\|, \|B\| \le 1$ and $\|[A,B]\| \le \delta$, there exists a pair $A', B'$ with $[A',B'] = 0$, $\|A - A'\| \le \epsilon$, and $\|B - B'\| \le \epsilon$.
This proof was later shortened and generalized by Friis and Rordam [3]. Interestingly, the same is not true for almost
commuting unitary matrices [5] or for almost commuting triplets [6], [7].

The importance of the above results is that the bound is uniform in $N$. That is, $\delta$ depends only on $\epsilon$. Unfortunately,
the proofs do not give any bounds on how $\delta$ depends on $\epsilon$. Further, the proofs of Lin and of Friis and Rordam are
nonconstructive, so there is no known way to find the matrices $A'$ and $B'$. In this paper, we present a construction
of matrices $A'$ and $B'$ which enables us to give lower bounds on how small $\delta$ must be to obtain a given error $\epsilon$.

Specifically, we prove that 

Theorem 1. Let $A$ and $B$ be Hermitian, $N$-by-$N$ matrices, with $\|A\|, \|B\| \le 1$. Suppose $\|[A,B]\| \le \delta$. Then, there
exist Hermitian, $N$-by-$N$ matrices $A'$ and $B'$ such that

1: $[A',B'] = 0$.

2: $\|A' - A\| \le \epsilon(\delta)$ and $\|B' - B\| \le \epsilon(\delta)$, with

$$\epsilon(\delta) = E(1/\delta)\,\delta^{1/5}, \tag{1}$$

where the function $E(x)$ grows slower than any power of $x$. The function $E(x)$ does not depend on $N$.

Throughout this paper, we use $\|\ldots\|$ to denote the operator norm of a matrix, and $|\ldots|$ to denote the $\ell^2$-norm of a
vector.

The proof of theorem (1) involves first constructing a related problem involving a block tridiagonal matrix, $H$,
and a block identity matrix $X$ (we use the term "block identity matrix" to refer to a block diagonal matrix that is
proportional to the identity matrix in each block). For such matrices we prove the theorem

Theorem 2. Let $X$ be a block identity Hermitian matrix and let $H$ be a block tridiagonal Hermitian matrix, with the
$j$-th block of $X$ equal to $c + j\Delta$ times the identity matrix, for some constants $c$ and $\Delta$. Let $\|H\|, \|X\| \le 1$. Then,
there exist Hermitian matrices $A'$ and $B'$ such that

1: $[A',B'] = 0$.

2: $\|A' - H\| \le \epsilon'(\Delta)$ and $\|B' - X\| \le \epsilon'(\Delta)$, with

$$\epsilon'(\Delta) = E'(1/\Delta)\,\Delta^{1/4}, \tag{2}$$

where the function $E'(x)$ grows slower than any power of $x$. The function $E'(x)$ does not depend on the dimension
of the matrices.

After proving these results, we prove a tighter bound in the case where H is a tridiagonal matrix, rather than a 
block tridiagonal matrix: 
Theorem 3. Let $X$ be a diagonal Hermitian matrix and let $H$ be a tridiagonal Hermitian matrix, with the $j$-th
diagonal entry of $X$ equal to $c + j\Delta$, for some constants $c$ and $\Delta$. Let $\|H\|, \|X\| \le 1$. Then, there exist Hermitian
matrices $A'$ and $B'$ such that

1: $[A',B'] = 0$.

2: $\|A' - H\| \le \epsilon''(\Delta)$ and $\|B' - X\| \le \epsilon''(\Delta)$, with

$$\epsilon''(\Delta) = E''(1/\Delta)\,\Delta^{1/2}, \tag{3}$$

where the function $E''(x)$ grows slower than any power of $x$. The function $E''(x)$ does not depend on the
dimension of the matrices.

The proofs rely heavily on ideas relating to Lieb-Robinson bounds [8-11]. These bounds, combined with appropriately
chosen filter functions, have been used in recent years in Hamiltonian complexity to study the dynamics and
ground states of quantum systems, obtaining results such as a higher dimensional Lieb-Schultz-Mattis theorem [9],
a proof of exponential decay of correlations [13], studies of dynamics out of equilibrium [15-17], new algorithms for
simulation of quantum systems [18-22], an area law for entanglement entropy for general interacting systems [23], study
of harmonic lattice systems [23], a Goldstone theorem with fewer assumptions [25], and many others. The present paper
represents a different application, to the study of almost commuting matrices.

Before beginning the proof, we give some discussion of the physics intuition behind the result. The next few paragraphs
are purely to motivate the problems from a physics viewpoint. In the last section on quantum measurement and
in the discussion at the end we give additional applications to quantum measurement and to the construction of Wannier
functions. The section on quantum measurement is intended to be self-contained. As mentioned, we begin by relating
this problem to the study of block tridiagonal matrices. We then interpret the matrix $H$ as a Hamiltonian for a
single particle moving in one dimension, and apply the Lieb-Robinson bounds. The result implies that we can
construct a complete orthonormal basis of states which are simultaneously localized in both position ($X$) and energy
($H$). It is certainly easy to construct an overcomplete basis of states which is localized in both position and energy,
by considering, for example, Gaussian wavepackets. The interesting result is the ability to construct an orthonormal
basis with this property.

Additional physics intuition can be obtained by considering the case where $H$ is a tridiagonal matrix with 0 on
the diagonal and elements just above and below the diagonal equal to 1, and where $X$ is a diagonal matrix with
entries $1/N, 2/N, \ldots$. We refer to this as a uniform chain. In the uniform chain case, if we define a new matrix $H'$
by randomly perturbing $H$, replacing each diagonal element of $H$ with a small number chosen at random,
the eigenvectors of $H'$ are localized with high probability [26, 27]. Then, we can construct a matrix $X'$ which exactly
commutes with $H'$ as follows: if $v$ is an eigenvector of $H'$, we choose it to have eigenvalue for $X'$ equal to $(v, Xv)$.
Then, since the eigenvectors are localized, we find that $\|X - X'\|$ is small. The difference $\|X - X'\|$ depends on the
localization length, which depends inversely on the amount of disorder, while the difference $\|H - H'\|$ depends on the
amount of disorder. Unfortunately, we do not have a good enough understanding of the effect of disorder for matrices
$H$ which are block tridiagonal, rather than just tridiagonal, to turn this approach into a proof for general $H$ and $X$,
and thus we rely on an alternative, constructive approach.
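The disorder-based construction of $X'$ described above can be sketched numerically. The following is only an illustration under stated assumptions: the function names, the uniform distribution of the random diagonal entries, and the hopping and disorder strengths are hypothetical choices, not taken from the paper.

```python
import numpy as np

def commuting_position_operator(H_prime, X):
    """Return X' = sum_v (v, X v) |v><v|, summed over eigenvectors v of H'."""
    _, V = np.linalg.eigh(H_prime)  # columns are orthonormal eigenvectors of H'
    # (v, X v) for each eigenvector v, i.e. the diagonal of V^dagger X V
    x_prime = np.real(np.einsum("ja,jk,ka->a", V.conj(), X, V))
    return (V * x_prime) @ V.conj().T

def disordered_chain(N, hopping, disorder, rng):
    """Tridiagonal chain with random diagonal entries (illustrative parameters)."""
    H = hopping * (np.eye(N, k=1) + np.eye(N, k=-1))
    return H + np.diag(disorder * rng.uniform(-1.0, 1.0, size=N))
```

By construction $X'$ is diagonal in the eigenbasis of $H'$, so the two commute up to rounding error; how small $\|X - X'\|$ is depends on how localized the eigenvectors are.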



I. PROOF OF MAIN THEOREM 



We now outline the proof of theorem (1). The proof is constructive, and is described by the following algorithm:

1: Construct $H$ from $A$ as described in section (II A) and lemma (1). We will bound $\|H - A\|$.

2: Construct $X$ from $B$ as described in section (II B). We will bound $\|X - B\|$. In a basis of eigenvectors of $X$, the
matrix $H$ will be block tridiagonal.

3: Construct a new basis as described in section (III) such that in this basis $H$ is close to a block diagonal matrix.
That is, we will bound the operator norm of the block off-diagonal part of $H$. The blocks will be different from
the blocks considered in step (2) above and will be larger. Further, we will show that $X$ is close to a block
identity matrix in this basis.

4: Set $A'$ to be the block diagonal part of $H$ in the basis constructed in step (3) and set $B'$ to the block identity
matrix constructed in step (3), so that $[A', B'] = 0$.

This algorithm involves several choices of constants. In a final section, (V), we indicate how to pick the constants to
obtain the error bound (1). The key step will be step 3.
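Step 4 of the outline relies on the fact that a block diagonal matrix and a block identity matrix with the same block structure commute exactly. A minimal sketch of that last step follows; the helper names and the representation of the block structure as a list of block sizes are assumptions made for illustration.

```python
import numpy as np

def block_diagonal_part(H, sizes):
    """Keep only the diagonal blocks of H; sizes lists the block dimensions."""
    out = np.zeros_like(H)
    start = 0
    for s in sizes:
        out[start:start + s, start:start + s] = H[start:start + s, start:start + s]
        start += s
    return out

def block_identity(values, sizes):
    """Block diagonal matrix equal to values[i] times the identity in block i."""
    return np.diag(np.repeat(np.asarray(values, dtype=float), sizes))
```

The hard part of the algorithm is step 3, finding a basis in which the discarded off-diagonal blocks are small; this sketch only covers the final truncation.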
II. REDUCTION TO BLOCK TRIDIAGONAL PROBLEM

The first two steps of the proof above (1,2) reduce theorem (1) to theorem (2), while the last two steps (3,4) prove
theorem (2). In this section we present the first two steps.

A. Construction of Finite-Range H 

We begin by constructing the matrix $H$ as given in the following lemma, where the constant $\Delta$ will be chosen later.

Lemma 1. Given Hermitian matrices $A$ and $B$, with $\|[A,B]\| \le \delta$, for any $\Delta$ there exists a Hermitian matrix $H$
with the following properties.

1: $\|[H,B]\| \le \delta$.

2: For any two vectors $v_1, v_2$ which are eigenvectors of $B$ with corresponding eigenvalues $x_1, x_2$, and with $|x_1 - x_2| >
\Delta$, we have $(v_1, H v_2) = 0$.

3: $\|A - H\| \le \epsilon_1$, with $\epsilon_1 = c_0 \delta/\Delta$, where $c_0$ is a numeric constant given below.

Proof. We define

$$H = \Delta \int dt\, \exp(iBt)\, A \exp(-iBt)\, f(\Delta t), \tag{4}$$

where the function $f(t)$ is defined to have the Fourier transform

$$\tilde f(\omega) = (1 - \omega^2)^3, \quad |\omega| \le 1, \tag{5}$$
$$\tilde f(\omega) = 0, \quad |\omega| \ge 1,$$

and hence the Fourier transform of $f(\Delta t)$ is supported on the interval $[-\Delta, \Delta]$. Properties (1) and (2) follow
immediately from Eq. (4). Property (3) follows from

\begin{align}
\|H - A\| &= \Big\| \Big( \Delta \int dt\, \exp(iBt) A \exp(-iBt) f(\Delta t) \Big) - A \Big\| \tag{6} \\
&= \Big\| \Delta \int dt\, \big( \exp(iBt) A \exp(-iBt) - A \big) f(\Delta t) \Big\| \\
&\le \Delta \int dt\, \| \exp(iBt) A \exp(-iBt) - A \| \, |f(\Delta t)| \\
&\le \Delta \int dt\, |t| \, \|[A,B]\| \, |f(\Delta t)| \\
&\le \delta \Delta \int dt\, |t f(\Delta t)| \\
&= c_0 \delta/\Delta,
\end{align}

where we define the constant $c_0$ by

$$c_0 = \int dt\, |t f(t)|. \tag{7}$$

The second line in Eq. (6) follows because $\tilde f(0) = 1$, so that $\Delta \int dt\, A f(\Delta t) = A$. Note that since the first and second
derivatives of $\tilde f(\omega)$ vanish at $\omega = \pm 1$, the function $f(t)$ decays as $1/t^3$ for large $t$ and hence $c_0$ is finite. Since $\tilde f$ is an
even function, $H$ is Hermitian.

Note that the precise form of the function $f(t)$ is unimportant: all we require is that $\tilde f(0) = 1$; that $\tilde f$ is supported
on the interval $[-1,1]$; that $\tilde f$ is sufficiently smooth that $f(t)$ decays fast enough for the integral over $t$ in Eq. (7) to converge;
and that $\tilde f$ is an even function. □
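In a basis of eigenvectors of $B$, the time integral in Eq. (4) can be carried out explicitly: the matrix element of $H$ between eigenvectors with eigenvalues $x_a, x_b$ is the corresponding matrix element of $A$ multiplied by $\tilde f((x_a - x_b)/\Delta)$. The sketch below uses this form rather than a numerical integral; the function names are hypothetical, and the normalization of the Fourier transform is assumed to be such that $\tilde f(0) = \int dt\, f(t)$.

```python
import numpy as np

def f_tilde(w):
    """Fourier transform of f from Eq. (5): (1 - w^2)^3 for |w| < 1, else 0."""
    w = np.asarray(w, dtype=float)
    return np.where(np.abs(w) < 1.0, (1.0 - w**2) ** 3, 0.0)

def finite_range_H(A, B, Delta):
    """Build H of lemma 1 by filtering A entrywise in the eigenbasis of B."""
    x, U = np.linalg.eigh(B)                 # B = U diag(x) U^dagger
    A_eig = U.conj().T @ A @ U               # A in the eigenbasis of B
    diff = x[:, None] - x[None, :]           # x_a - x_b for every pair
    H_eig = A_eig * f_tilde(diff / Delta)    # entries with |x_a - x_b| >= Delta vanish
    return U @ H_eig @ U.conj().T, x, U
```

The returned matrix is Hermitian because `f_tilde` is real and even, and its diagonal in the eigenbasis of $B$ agrees with that of $A$ because `f_tilde(0) = 1`.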






Remark: In a basis of eigenvectors of $B$, property (2) in the above lemma implies that $H$ is "finite-range", in that
the off-diagonal elements vanish for sufficiently large $|x_1 - x_2|$. The next theorem is a Lieb-Robinson bound for
such finite-range Hamiltonians, similar to those proven for many-body Hamiltonians [8-11]. This result is also similar
to results on the decay of entries of smooth functions of matrices proven in [TJJ [13] . 

We now introduce some terminology. Given two sets of real numbers, $S_1, S_2$, we define

$$\mathrm{dist}(S_1, S_2) = \min_{x_1 \in S_1,\, x_2 \in S_2} |x_1 - x_2|. \tag{8}$$

Remark: The reason for introducing this "distance function" is that we think of $H$ as defining the Hamiltonian
for a one-dimensional, finite-range quantum system, with different "sites" of the system corresponding to different
eigenvectors of $B$, and then the distance function is the distance between different sets of sites.

Further, we say that a vector w is "supported on set S for position operator B" if w is a linear combination 
of eigenvectors of B whose corresponding eigenvalues are in set S. Finally, for any set S we define the projector 
P(S, B) to be the projector onto eigenvectors of B whose corresponding eigenvalues lie in set S. We now give the 
Lieb-Robinson bound: 

Theorem 4. Let $H$ have the properties

1: $\|H\| \le 1$.

2: For any two vectors $v_1, v_2$ which are eigenvectors of $B$ with corresponding eigenvalues $x_1, x_2$, and with $|x_1 - x_2| >
\Delta$, we have $(v_1, H v_2) = 0$.

Define

$$v_{LR} = e^2 \Delta. \tag{9}$$

Then, for any vector $v$ supported on a set $S_1$ for position operator $B$, and for any projector $P(S_2, B)$, we have

$$|P(S_2, B) \exp(-iHt) v| \le e^{-\mathrm{dist}(S_1,S_2)/\Delta} |v| \tag{10}$$

for any

$$|t| \le \mathrm{dist}(S_1, S_2)/v_{LR}. \tag{11}$$

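Proof. Expand $\exp(-iHt)v$ in a power series as $v - iHtv - (H^2/2)t^2 v + \ldots$. Then, by assumption,
$P(S_2, B)(-it)^n (H^n/n!) v$ vanishes for $n < \mathrm{dist}(S_1, S_2)/\Delta$. Let $m = \lceil \mathrm{dist}(S_1, S_2)/\Delta \rceil$. Then,

\begin{align}
\Big| \sum_{n \ge m} (-it)^n (H^n/n!) v \Big| &\le \sum_{n \ge m} (|t|^n/n!) |v| \tag{12} \\
&\le \frac{1}{e} \sum_{n \ge m} (e|t|/n)^n |v| \\
&\le \frac{1}{e} (e|t|/m)^m \frac{1}{1 - e|t|/m} |v|.
\end{align}

For the given $v_{LR}$ and $t$, the result follows. □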

Remark: the proof of this Lieb-Robinson bound is significantly simpler than the proofs of the corresponding 
bounds for many-body systems considered elsewhere. The power series technique used here does not work for such 
systems. 
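The bound of theorem 4 can be checked numerically on a small example. The sketch below is an illustration only: it assumes a nearest-neighbour chain whose sites sit at positions $0, \Delta, 2\Delta, \ldots$, so that the distance between site 0 and the set of sites $k, k+1, \ldots$ is $k\Delta$; the function names are hypothetical.

```python
import numpy as np

def evolve(H, v, t):
    """Return exp(-i H t) v using the eigendecomposition of the Hermitian matrix H."""
    w, U = np.linalg.eigh(H)
    return U @ (np.exp(-1j * w * t) * (U.conj().T @ v))

def leaked_norm(H, v, t, far_sites):
    """Norm of the part of exp(-i H t) v supported on the index set far_sites."""
    return np.linalg.norm(evolve(H, v, t)[far_sites])
```

For a chain with hopping 1/2 (so that $\|H\| \le 1$), a vector supported on site 0, and the far set consisting of sites $k$ and beyond, theorem 4 predicts `leaked_norm` is at most $e^{-k}$ for all $|t| \le k/e^2$.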

B. Construction of X 

In this subsection, we construct the matrix $X$ from $B$. We define a function $Q(x)$ by

$$Q(x) = \Delta \lfloor x/\Delta + 1/2 \rfloor. \tag{13}$$

Then, we set

$$X = Q(B). \tag{14}$$

Note that $|Q(x) - x| \le \Delta/2$ for all $x$, and $Q(x)/\Delta$ is always an integer. Then,

$$\|X - B\| \le \epsilon_2, \tag{15}$$

with

$$\epsilon_2 = \Delta/2. \tag{16}$$

By (2) in lemma (1), the matrix $H$ is a block tridiagonal matrix when written in a basis of eigenstates of $X$, with
eigenvalues of $X$ ordered in increasing order.
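The rounding in Eqs. (13)-(14) acts on the eigenvalues of $B$ and leaves its eigenvectors unchanged. A minimal sketch, with a hypothetical function name:

```python
import numpy as np

def quantize(B, Delta):
    """Return X = Q(B) with Q(x) = Delta * floor(x / Delta + 1/2), and its eigenvalues."""
    x, U = np.linalg.eigh(B)                 # B = U diag(x) U^dagger
    q = Delta * np.floor(x / Delta + 0.5)    # each eigenvalue moves by at most Delta / 2
    X = (U * q) @ U.conj().T
    return X, q
```

Since $X$ and $B$ share eigenvectors, $\|X - B\|$ is the largest shift of any single eigenvalue, which is at most $\Delta/2$ as in Eqs. (15)-(16).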

III. CONSTRUCTION OF NEW BASIS 

In this section we construct the basis to make $H$ close to a block diagonal matrix and $X$ close to a block identity
matrix. This completes step (3) of the construction of $A'$ and $B'$. We refer to the basis constructed in this step as
the "new basis" and we refer to the basis in which $X$ is diagonal as the "old basis".

There will be a total of $n_{cut} + 1$ different blocks in the new basis, where $n_{cut}$ is chosen later. Before constructing the
new basis, we give some definitions. We define an interval by $I_i = [-1 + 2(i-1)/n_{cut}, -1 + 2i/n_{cut})$ for $1 \le i < n_{cut}$
and $I_i = [-1 + 2(i-1)/n_{cut}, -1 + 2i/n_{cut}]$ for $i = n_{cut}$. Let $J_i$ be the matrix given by projecting $H$ onto the subspace
spanned by eigenvectors of $X$ with eigenvalues lying in this interval $I_i$, and call this subspace $\mathcal{B}_i$. Then, in the old basis of eigenvectors of $X$,
$J_i$ is block tridiagonal with at least $L$ different blocks, where $L = \lfloor 2/(n_{cut}\Delta) - 1 \rfloor$ (some of these blocks might have
dimension zero if $B$ happens to have fewer than $L$ distinct eigenvalues in that interval). We will choose $n_{cut}$ later so
that $L \gg 1$ and so the new basis will have fewer blocks than the old basis. Before constructing the new basis we
need the following lemma.

We claim that:

Lemma 2. Let $J$ be a Hermitian block tridiagonal matrix, with $\|J\| \le 1$, acting on a space $\mathcal{B}$. Let there be $L$ blocks,
so that the space $\mathcal{B}$ has $L$ orthogonal subspaces, which we write $V_j$ for $j = 1, \ldots, L$, with $(v, Jw) = 0$ for $v \in V_i$ and
$w \in V_j$ with $|i - j| > 1$. Then, there exists a space $\mathcal{W}$ which is a subspace of $\mathcal{B}$ with the following properties:

1: The projection of any normalized vector $v \in V_1$ onto the orthogonal complement of $\mathcal{W}$ has norm bounded by $\epsilon_3$,
where $\epsilon_3$ is equal to $1/L^{1/3}$ times a function growing slower than any power of $L$.

2: For any normalized vector $w \in \mathcal{W}$, the projection of $Jw$ onto the orthogonal complement of $\mathcal{W}$ has norm bounded
by $\epsilon_4$, where $\epsilon_4$ is equal to $1/L^{1/3}$ times a function growing slower than any power of $L$.

3: The projection of any normalized vector $v \in V_L$ onto $\mathcal{W}$ has norm bounded by $\epsilon_5$, where $\epsilon_5$ is equal to a function
decaying faster than any power of $L$.

Proof. This lemma is the key step in the proof of the main theorem, and the proof of this lemma is given in the next 
section. □ 

For each $i$, $1 \le i \le n_{cut}$, we apply lemma (2) to the matrix $J = J_i$ defined on the space $\mathcal{B} = \mathcal{B}_i$. For given $i$, we refer
to the space $\mathcal{W}$ as constructed in lemma (2) as $\mathcal{W}_i$ and we refer to the subspaces $V_j$ defined in lemma (2) as $V_j(i)$.
Let $\mathcal{B}_i$ have dimension $D_B(i)$ and let $\mathcal{W}_i$ have dimension $D_W(i)$. Let $\mathcal{W}_i^\perp$ denote the $(D_B(i) - D_W(i))$-dimensional
space which is the orthogonal complement of $\mathcal{W}_i$ in $\mathcal{B}_i$. Let $V_j(i)$ have dimension $d_j(i)$. By properties (1,3) in lemma (2),
$D_W(i) \ge d_1(i)$ and $D_W(i) \le D_B(i) - d_L(i)$.

The new basis has $n_{cut} + 1$ blocks, which we label by $i = 0, 1, \ldots, n_{cut}$. For $1 \le i < n_{cut}$, we define the $i$-th block
of the new basis to be the space spanned by $\mathcal{W}_{i+1}$ and $\mathcal{W}_i^\perp$. For $i = 0$, the $i$-th block is the space $\mathcal{W}_{i+1} = \mathcal{W}_1$. For
$i = n_{cut}$, the $i$-th block is the space $\mathcal{W}_i^\perp = \mathcal{W}_{n_{cut}}^\perp$.

Then, the matrix $H$ is a block band matrix in this new basis. The matrix $H$ will have terms on the main diagonal,
on the diagonals directly above and below the main diagonal, and on the diagonals above and below those, so it only
has terms within distance two of the main diagonal. The block-off-diagonal terms directly above and below the main diagonal
arise from three sources. First, the matrix $J_i$ contains non-vanishing matrix elements between the spaces $\mathcal{W}_i$ and
$\mathcal{W}_i^\perp$, and those spaces are now in different blocks. However, by property (2) in lemma (2), these matrix elements
are bounded by $\epsilon_4$. Second, there are non-vanishing matrix elements between the subspace $\mathcal{W}_{i-1}^\perp$ and $V_1(i)$, and $V_1(i)$
may not be completely contained in subspace $\mathcal{W}_i$. However, by property (1) in lemma (2), these contribute only $\epsilon_3$
to the norm of the block-off-diagonal terms of $H$ in the new basis. Third, there are non-vanishing matrix elements
between $\mathcal{W}_{i+1}$ and $V_L(i)$, and $V_L(i)$ may not be completely contained in subspace $\mathcal{W}_i^\perp$. However, by property (3) in lemma
(2), these contribute only $\epsilon_5$ to the norm of the block-off-diagonal terms of $H$ in the new basis. Therefore, these
block-off-diagonal terms in $H$ are bounded in operator norm by

$$2(\epsilon_3 + \epsilon_4 + \epsilon_5). \tag{17}$$

The terms which are on the diagonals two above or two below the main diagonal arise from the fact that there are
non-vanishing matrix elements between $V_L(i)$ and $V_1(i+1)$, and $V_L(i)$ has some non-vanishing overlap with $\mathcal{W}_i$ and $V_1(i+1)$ has
some non-vanishing overlap with $\mathcal{W}_{i+1}^\perp$. These terms are bounded by $\epsilon_3 \epsilon_5$.

Define $B'$ to be the block identity matrix (in the new basis) which is equal to $-1 + 2i/n_{cut}$ times the identity matrix
in the $i$-th block. Since each block $i$ in the new basis lies within the space spanned by $\mathcal{B}_i$ and $\mathcal{B}_{i+1}$, we have

$$\|B' - B\| \le 2/n_{cut}. \tag{18}$$
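Remark: Here is a sketch of the above procedure, in a case where $H$ has 8 blocks and $n_{cut} = 2$. The matrix
originally looks like

$$\begin{pmatrix}
\ldots & \ldots & & & & & & \\
\ldots & \ldots & \ldots & & & & & \\
& \ldots & \ldots & \ldots & & & & \\
& & \ldots & \ldots & \ldots & & & \\
& & & \ldots & \ldots & \ldots & & \\
& & & & \ldots & \ldots & \ldots & \\
& & & & & \ldots & \ldots & \ldots \\
& & & & & & \ldots & \ldots
\end{pmatrix} \tag{19}$$

where the $\ldots$ indicate non-vanishing entries. We combine the entries in the first 4 blocks into a matrix $J_1$ and the
entries in the last 4 into a matrix $J_2$ so $H$ looks like

$$\begin{pmatrix} J_1 & \ldots \\ \ldots & J_2 \end{pmatrix} \tag{20}$$

where the $\ldots$ couples only the $L$-th block of space $\mathcal{B}_1$ to the 1-st block of space $\mathcal{B}_2$. Then, we apply lemma 2 to
decompose $\mathcal{B}_1$ into spaces $\mathcal{W}_1$ and $\mathcal{W}_1^\perp$ so that $J_1$ looks like

$$\begin{pmatrix} \ldots & O(\epsilon_4) \\ O(\epsilon_4) & \ldots \end{pmatrix} \tag{21}$$

and similarly for $J_2$. Inserting Eq. (21) into Eq. (20), $H$ looks like (in the new basis)

$$\begin{pmatrix}
\ldots & O(\epsilon_4) & O(\epsilon_5) & O(\epsilon_3 \epsilon_5) \\
O(\epsilon_4) & \ldots & \ldots & O(\epsilon_3) \\
O(\epsilon_5) & \ldots & \ldots & O(\epsilon_4) \\
O(\epsilon_3 \epsilon_5) & O(\epsilon_3) & O(\epsilon_4) & \ldots
\end{pmatrix} \tag{22}$$

which is close to the block diagonal matrix,

$$\begin{pmatrix}
\ldots & & & \\
& \ldots & \ldots & \\
& \ldots & \ldots & \\
& & & \ldots
\end{pmatrix} \tag{23}$$

which has $3 = n_{cut} + 1$ blocks.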



IV. PROOF OF LEMMA 2



Let the space $V_1$ be $d_1$ dimensional, with orthonormal basis vectors $v_1, \ldots, v_{d_1}$. Let $S$ denote the $D_B$-by-$d_1$ matrix
whose columns are these basis vectors, so that $S$ is an isometry.

Define a function $\mathcal{F}(\omega_0, r, w, \omega)$ as follows. Let $\mathcal{F}(0,0,1,\omega) = 1$ for $\omega = 0$. Let $\mathcal{F}(0,0,1,\omega) = 0$ for $|\omega| \ge 1$.
Let $\mathcal{F}(0,0,1,\omega) = \mathcal{F}(0,0,1,-\omega)$. For $0 < \omega < 1$, choose $\mathcal{F}(0,0,1,\omega)$ to be infinitely differentiable so that the
Fourier transform of $\mathcal{F}(0,0,1,\omega)$, which we write $\tilde{\mathcal{F}}(0,0,1,t)$, is bounded by a function which decays faster than any
polynomial. Finally, we impose $\mathcal{F}(0,0,1,\omega) + \mathcal{F}(0,0,1,1-\omega) = 1$ for $0 \le \omega \le 1$.

FIG. 1: Sketch of (a) $\mathcal{F}(0,1,1,\omega) = \mathcal{F}(-1,0,1,\omega) + \mathcal{F}(0,0,1,\omega) + \mathcal{F}(1,0,1,\omega)$ and (b) $\mathcal{F}(0,0,1,\omega)$.

For general $\omega_0, r, w$, define the function $\mathcal{F}(\omega_0, r, w, \omega)$ by $\mathcal{F}(\omega_0, r, w, \omega) = 1$ for $|\omega - \omega_0| \le r$, and $\mathcal{F}(\omega_0, r, w, \omega) =
\mathcal{F}(0, 0, 1, (|\omega - \omega_0| - r)/w)$ for $|\omega - \omega_0| \ge r$. Then $\mathcal{F}(\omega_0, r, w, \omega) = 0$ for $|\omega - \omega_0| \ge r + w$. For $r > 0$ and $w > 0$,
the function $\mathcal{F}(\omega_0, r, w, \omega)$ is infinitely differentiable with respect to $\omega$ everywhere. The functions $\mathcal{F}(0,1,1,\omega)$ and
$\mathcal{F}(0,0,1,\omega)$ are sketched in Fig. 1a,b; the variable $r$ denotes the width of the flat part at the center of the function,
while $w$ denotes the width of the changing part of the function. Since $\mathcal{F}(0,0,1,\omega)$ is infinitely differentiable, there is
a function $T(x)$ which decays faster than any polynomial such that:

\begin{align}
\int_{|t| \ge t_0} dt\, |\tilde{\mathcal{F}}(\omega_0, w, w, t)| &\le T(w t_0), \tag{24} \\
\int_{|t| \ge t_0} dt\, |\tilde{\mathcal{F}}(\omega_0, 0, w, t)| &\le T(w t_0).
\end{align}

The operator norm of $J$ is bounded by 1. The idea of the proof is to divide the interval of eigenvalues of $J$, which is
$[-1, 1]$, into various small overlapping windows. Then, for each window centered on a frequency $\omega$, we will construct
vectors given by approximately projecting vectors in $V_1$ onto the space spanned by eigenvectors of $J$ with eigenvalues
lying in that window; we call the spaces of these vectors $\mathcal{X}_i$, where $i$ labels the particular window. Then, each of these
projected vectors $x$ will have the property that $Jx$ is close to $\omega x$. This will be the key step in ensuring property (2)
in the claims of the lemma. The idea of approximate projection is important here. In fact, we will use the smooth
filter functions $\mathcal{F}(\omega_0, r, w, \omega)$ above. The smoothness will be essential to ensure that the vectors $x$ have most of their
amplitude in the first blocks rather than the last blocks. Since the vectors in the spaces $\mathcal{X}_i$ are approximate projections
of vectors in $V_1$ into different windows, we will be able to approximate any vector $v_1 \in V_1$ by a vector in the space
spanned by the $\mathcal{X}_i$ simply by adding up the projections of $v_1$ in each different window. Because the windows overlap,
the vectors may not be orthogonal to each other; the overlap between vectors is something we will need to bound (see
Eq. (49) below). To control the overlap, we choose $\mathcal{W}$ to be a subspace of the space spanned by the $\mathcal{X}_i$ as explained
below; this will then require us to be careful to ensure that we are still able to approximate vectors in $V_1$ by vectors
in $\mathcal{W}$.

Let $n_{win}$ be some even integer chosen later. We will choose

$$n_{win} = L/F(L), \tag{25}$$

where the function $F(L)$ is a function that grows slower than any power of $L$ and is defined further below. The choice
of function $F(L)$ will depend only on the function $T(x)$ defined above.

For each $i = 0, \ldots, n_{win} - 1$, define

$$\omega(i) = -1 + 2i/(n_{win} - 1). \tag{26}$$

Define

$$\kappa = 2/(n_{win} - 1), \tag{27}$$

so $\omega(i) = -1 + i\kappa$.

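When $\omega(i)$ and $\kappa$ are chosen as above, we have $\sum_{i=0}^{n_{win}-1} \mathcal{F}(\omega(i), 0, \kappa, \omega) = 1$ for $-1 \le \omega \le 1$. See Fig. 2(a) to see a
sketch of three functions $\mathcal{F}(\omega(i-1), 0, \kappa, \omega)$, $\mathcal{F}(\omega(i), 0, \kappa, \omega)$, and $\mathcal{F}(\omega(i+1), 0, \kappa, \omega)$; as $\mathcal{F}(\omega(i), 0, \kappa, \omega)$ decreases for
$\omega(i) < \omega < \omega(i+1)$, the function $\mathcal{F}(\omega(i+1), 0, \kappa, \omega)$ is increasing to keep the sum constant.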
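The statement that the overlapping windows sum to one can be illustrated with one concrete choice of filter. The paper only requires that the filter be even, infinitely differentiable, equal to 1 at the origin, vanish outside $[-1,1]$, and satisfy $\mathcal{F}(0,0,1,\omega) + \mathcal{F}(0,0,1,1-\omega) = 1$; the particular smooth step built from $\exp(-1/u)$ below, and all function names, are assumptions made for this sketch.

```python
import numpy as np

def smooth_step(u):
    """Infinitely differentiable step: 0 for u <= 0, 1 for u >= 1."""
    u = np.clip(np.asarray(u, dtype=float), 0.0, 1.0)
    a = np.where(u > 0.0, np.exp(-1.0 / np.where(u > 0.0, u, 1.0)), 0.0)
    b = np.where(u < 1.0, np.exp(-1.0 / np.where(u < 1.0, 1.0 - u, 1.0)), 0.0)
    return a / (a + b)

def window(omega0, r, w, omega):
    """One admissible filter F(omega0, r, w, omega): flat half-width r, ramp width w."""
    u = (np.abs(np.asarray(omega, dtype=float) - omega0) - r) / w
    return 1.0 - smooth_step(u)

def window_centers(n_win):
    """Centers omega(i) = -1 + i * kappa and the spacing kappa = 2 / (n_win - 1)."""
    kappa = 2.0 / (n_win - 1)
    return -1.0 + kappa * np.arange(n_win), kappa
```

Summing `window(c, 0.0, kappa, omega)` over all centers `c` gives 1 for every `omega` in $[-1,1]$, because on each interval between neighbouring centers exactly two windows are non-zero and their ramps are complementary.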



FIG. 2: a) Sketch of overlapping windows. b) Re-arrangement of windows as discussed in section on tridiagonal matrices.

A. Construction of Spaces $\mathcal{X}_i$

To construct $\mathcal{X}_i$, we define the matrix $\tau_i$ by

$$\tau_i = \mathcal{F}(\omega(i), 0, \kappa, J) S. \tag{28}$$

Define

$$\lambda_{min} = 1/(n_{win} L^2). \tag{29}$$

Compute the eigenvectors of the matrix $\tau_i^\dagger \tau_i$. For each eigenvector $x_a$ with eigenvalue greater than or equal to $\lambda_{min}$
compute $y_a = \tau_i x_a$. Let $\mathcal{X}_i$ be the space spanned by all such vectors $y_a$. Let $Z_i$ project onto the eigenvectors $x_a$ with
eigenvalue less than $\lambda_{min}$; the projector $Z_i$ will be used later in computing the error estimates.
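The construction of a single space $\mathcal{X}_i$ can be sketched as follows. This is an illustration under assumptions: the filter is one admissible smooth choice built from $\exp(-1/u)$ (the paper does not fix a specific filter), the filter is applied to $J$ through its eigendecomposition, and the function names are hypothetical.

```python
import numpy as np

def smooth_step(u):
    """Infinitely differentiable step: 0 for u <= 0, 1 for u >= 1."""
    u = np.clip(np.asarray(u, dtype=float), 0.0, 1.0)
    a = np.where(u > 0.0, np.exp(-1.0 / np.where(u > 0.0, u, 1.0)), 0.0)
    b = np.where(u < 1.0, np.exp(-1.0 / np.where(u < 1.0, 1.0 - u, 1.0)), 0.0)
    return a / (a + b)

def window(omega0, r, w, omega):
    """One admissible filter F(omega0, r, w, omega)."""
    return 1.0 - smooth_step((np.abs(omega - omega0) - r) / w)

def construct_X_i(J, S, i, n_win, L):
    """Return spanning vectors of X_i (as columns), the projector Z_i, omega(i), kappa."""
    kappa = 2.0 / (n_win - 1)
    omega_i = -1.0 + i * kappa
    lam_min = 1.0 / (n_win * L**2)
    evals, U = np.linalg.eigh(J)
    # Filter applied to J: U F(evals) U^dagger
    filt = (U * window(omega_i, 0.0, kappa, evals)) @ U.conj().T
    tau = filt @ S                                   # tau_i
    lam, x = np.linalg.eigh(tau.conj().T @ tau)      # eigenvectors x_a of tau_i^dagger tau_i
    keep = lam >= lam_min
    Y = tau @ x[:, keep]                             # columns y_a = tau_i x_a
    Z = x[:, ~keep] @ x[:, ~keep].conj().T           # projector onto discarded x_a
    return Y, Z, omega_i, kappa
```

Each returned column lies in the span of eigenvectors of $J$ whose eigenvalues are within $\kappa$ of $\omega(i)$, so it is an approximate eigenvector of $J$ with eigenvalue $\omega(i)$ up to an error of at most $\kappa$ times its norm.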

Remark: To understand this construction, in Fig. 2a we sketch the functions $\mathcal{F}(\omega(i-1), 0, \kappa, \omega)$, $\mathcal{F}(\omega(i), 0, \kappa, \omega)$,
and $\mathcal{F}(\omega(i+1), 0, \kappa, \omega)$, which form partially overlapping windows. Note that the vectors $\mathcal{F}(\omega(i), 0, \kappa, J) S x_1$ and
$\mathcal{F}(\omega(i \pm 1), 0, \kappa, J) S x_2$, for arbitrary $x_1, x_2$, need not be orthogonal.

B. Properties of $\mathcal{X}_i$

This subsection establishes certain properties of the $\mathcal{X}_i$. It is primarily intended to motivate the construction thus
far. We will show that the $\mathcal{X}_i$ have three properties which are closely related to the three properties we desire to show
in lemma (2).

First, for any normalized vector $v \in V_1$, the projection of $v$ onto the orthogonal complement of the space spanned
by the $\mathcal{X}_i$ is bounded by $\sqrt{2 n_{win} \lambda_{min}} = \sqrt{2}/L$. To show this, for any $v \in V_1$, with $|v| = 1$, we write $v = Sx$ with
$|x| = 1$, and then

\begin{align}
\Big| v - \sum_i \tau_i (1 - Z_i) x \Big|^2 &= \Big| \sum_i \tau_i Z_i x \Big|^2 \tag{30} \\
&\le 2 \sum_{i=0}^{n_{win}-1} |\tau_i Z_i x|^2 \\
&\le 2 \sum_{i=0}^{n_{win}-1} \lambda_{min} \\
&\le 2/L^2.
\end{align}

The factor of 2 in the first inequality follows because $(\tau_i Z_i x, \tau_j Z_j x) = 0$ for $|i - j| > 1$, but may be non-vanishing for
$i = j \pm 1$. Similar factors of 2 occur in several other places.

Second, each space $\mathcal{X}_i$ is an approximate eigenspace of $J$. That is, for any $v_i \in \mathcal{X}_i$, we have

$$|(J - \omega(i)) v_i| \le \kappa |v_i|. \tag{31}$$
Third, for any vector $y \in \mathcal{X}_i$, the norm of the projection of $y$ onto $V_L$ is bounded by $|y|$ times a function decaying
faster than any negative power of $L$. It is here that we will pick the function $F(L)$ and use the Lieb-Robinson bounds. Let
$\tilde{\mathcal{F}}(\omega_0, r, w, t)$ denote the Fourier transform of $\mathcal{F}(\omega_0, r, w, \omega)$ with respect to the last variable $\omega$. Then, for any $x$ with
$(1 - Z_i)x = x$, we find that $y = \tau_i x = \mathcal{F}(\omega(i), 0, \kappa, J) S x$ is equal to

$$y = \int dt\, \tilde{\mathcal{F}}(\omega(i), 0, \kappa, t) \exp(iJt) S x. \tag{32}$$

We use the Lieb-Robinson bounds for matrix $J$, by defining a position matrix which is equal to $j$ in the $j$-th block.
Using the Lieb-Robinson bounds, for time $t \le L/v_{LR}$, with $v_{LR} = e^2$, we find that the norm of the projection of
$\exp(iJt)Sx$ onto the space $V_L$ is bounded by $\exp(-L)$. At the same time, the integral $\int_{|t| \ge L/v_{LR}} dt\, |\tilde{\mathcal{F}}(\omega(i), 0, \kappa, t)|$ is
bounded by $T(2L/(v_{LR} n_{win})) = T(2F(L)/e^2)$. Since $T(x)$ decays faster than any negative power of $x$, we can choose
an $F(x)$ which grows slower than any power of $x$ such that $T(2F(L)/e^2)$ still decays faster than any negative power of
$L$. Thus, since $|y| \ge \lambda_{min}^{1/2} |x|$ by construction, for this choice of $F(x)$ the norm of the projection of any vector $y \in \mathcal{X}_i$
onto $V_L$ is bounded by $|y|$ times a function decaying faster than any negative power of $L$.

The reason for picking $\lambda_{min} > 0$ is to help establish the third property above. Let us give an example of a situation
where we would encounter problems if we had taken $\lambda_{min} = 0$. Consider a matrix of the form

$$\begin{pmatrix}
0 & 1/4 & & & & \\
1/4 & 0 & 1/4 & & & \\
& 1/4 & 0 & \ddots & & \\
& & \ddots & \ddots & 1/4 & \\
& & & 1/4 & 0 & 1/4 \\
& & & & 1/4 & 1/2
\end{pmatrix} \tag{33}$$

Here, each block has size one. If it weren't for the "1/2" in the last line, this matrix would have operator norm slightly
less than 1/2. However, because of the 1/2, this matrix has one eigenvalue greater than 1/2. For this particular choice
of matrix, this eigenvalue is close to 5/8. The corresponding eigenvector is localized near the last block, and is
exponentially small in the first block. If we project a vector in $V_1$ into a narrow window centered on $\omega(i) = 5/8$, the
result will project onto this eigenvector, and thus the resulting state will have large amplitude on $V_L$. However, for
such a window, we would find that $\tau_i$ would be exponentially small, and so we would not include this vector in $\mathcal{X}_i$.
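The claims about the example matrix in Eq. (33) are easy to check numerically. The sketch below assumes the matrix has zeros on the diagonal except for the final entry 1/2, and 1/4 on the two neighbouring diagonals; the function name is hypothetical.

```python
import numpy as np

def example_matrix(N):
    """N-by-N matrix of Eq. (33): hopping 1/4, zero diagonal, last diagonal entry 1/2."""
    M = 0.25 * (np.eye(N, k=1) + np.eye(N, k=-1))
    M[-1, -1] = 0.5
    return M
```

Calling `np.linalg.eigh(example_matrix(N))` for moderately large `N` gives a largest eigenvalue very close to 5/8, separated from the rest of the spectrum, with an eigenvector concentrated on the last site and exponentially small on the first.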

The properties we have established for spaces $\mathcal{X}_i$ are closely related to the properties in lemma (2) that we are
trying to establish. Unfortunately, the spaces $\mathcal{X}_i$ need not be orthogonal, and in fact may be very far from orthogonal.
This can lead to problems like the following: suppose we have two vectors, $v_1 \in \mathcal{X}_1$ and $v_2 \in \mathcal{X}_2$. We know that the
projection of $v_1$ onto $V_L$ is small compared to $|v_1|$, and we know the same thing for $v_2$; however, we don't know that
the projection of $v_1 + v_2$ onto $V_L$ is small compared to $|v_1 + v_2|$, because we don't know how $|v_1 + v_2|$ compares to
$|v_1|$ and $|v_2|$. We have two different ways of dealing with this: in the next subsection, we present a construction for
block tridiagonal matrices that involves choosing a subspace of the space spanned by the $\mathcal{X}_i$. In a later section on
tridiagonal matrices, we present a much simpler construction that involves combining several windows into one; the
reader may prefer to read that section first.



C. Construction of W 



We now construct the space $\mathcal{W}$. Let each space $\mathcal{X}_i$ have dimension $D_i$. In each space $\mathcal{X}_i$ we can find an orthonormal
basis of vectors, $v_{i,b}$, for $b = 1, \ldots, D_i$. We define a block tridiagonal matrix $\rho$ of inner products of vectors $v_{i,b}$ as
follows: the $i$-th block (for $0 \le i < n_{win}$) has dimension $D_i$, and on the diagonal the matrix is equal to the identity
matrix. Above the diagonal, the block in the $i$-th row and $(i+1)$-st column is equal to the matrix of inner products
$(v_{i,b}, v_{i+1,c})$ for $b = 1, \ldots, D_i$ and $c = 1, \ldots, D_{i+1}$. Note that for $|i - j| > 1$, the spaces $\mathcal{X}_i$ and $\mathcal{X}_j$ are orthogonal,
so that the matrix $\rho$ is block tridiagonal. We define a new vector space $\mathcal{R}$ to be a space of dimension $\sum_{i=0}^{n_{win}-1} D_i$.
The matrix $\rho$ is Hermitian and positive semidefinite. It is equal to $\rho = A^\dagger A$, for some matrix $A$ which is also block
tridiagonal. The matrix $A$ is a linear operator from $\mathcal{R}$ to $\mathcal{B}$; it is simply a matrix whose columns, in a given block,
are different basis vectors for the space $\mathcal{X}_i$ corresponding to that block.

Remark: The matrix $\rho$ is block tridiagonal. To motivate what follows, consider the following circular reasoning:
given that $\rho$ is block tridiagonal, if we knew that theorem (2) were true, we could find a basis in which $\rho$ was
approximately diagonal and in which a position operator, a block diagonal matrix equal to $i/n_{win}$ in the $i$-th block,
was also approximately diagonal. Then we choose $\mathcal{W}$ to be the space spanned by vectors of the form $A w_i$, where $w_i$
are basis vectors in this basis for which the diagonal entry of $\rho$ is not too close to zero (how close is something we
would pick later). Then, we would know that the vectors $A w_i$ and $A w_j$ are not degenerate for $i \ne j$, and the operator
$A$ would be an approximate isometry from the space spanned by the $w_i$ to $\mathcal{W}$. Also, we would find that $A w_i$ was an
approximate eigenvector of $J$. We would know that any vector $v \in V_1$ had small projection orthogonal to $\mathcal{W}$, since $v$
could be written as $Sx$ and, while $x$ may have some projection onto vectors $w_i$ for which the diagonal entry of $\rho$ is
very close to zero, the error in $v$ we make by dropping those vectors from $x$ is small. This would give the space $\mathcal{W}$
the properties we are trying to construct. Unfortunately, of course, we are trying to prove theorem (2), so this line
of reasoning does not help. However, we do not need such a strong result in the present construction, as will be seen
below. Importantly, if there is a vector $w$ such that $(w, \rho w)$ is small, it leads to only a small error in our ability to
approximate vectors $v \in V_1$ if the vector $Aw$ is orthogonal to the space $\mathcal{W}$. We will also make use of a related fact: if
there is a vector $w = w_1 + w_2$ such that $(w_1 + w_2, \rho(w_1 + w_2))$ is small, then this means that $A w_1$ is close to $-A w_2$.
Suppose $A w_1 \in \mathcal{X}_1$ and $A w_2 \in \mathcal{X}_2$. Then, we can take $\mathcal{W}$ to be the space spanned by $\mathcal{X}_2, \mathcal{X}_3, \ldots$ and by the
subspace of $\mathcal{X}_1$ orthogonal to $A w_1$, and this leads to only a small error in our ability to approximate vectors $v \in V_1$
by vectors in $\mathcal{W}$. This is the basic idea behind the construction that follows.

FIG. 3: Sketch of which blocks are in which subspaces $\mathcal{Y}_i$ for $n_{sb} = 6$, as well as which blocks are in the range of the $P_i$.

We define spaces $\mathcal{Y}_i$ for $i = 0, \ldots, n_{sb} - 1$, as follows, where $n_{sb}$ is the smallest even integer greater than or equal to
$n_{win}/l_b$, with the "block length" $l_b$ being an integer equal to

h=[nJt n }- (34) 

Here, "sb" stands for "super-block" as we combine several blocks into one superblock. We pick $\mathcal{Y}_i$ to be the subspace of $\mathcal{R}$ spanned by the vectors in blocks from the $(i-1)l_b$-th block to the $((i+1)l_b - 1)$-th block. That is, it is the subspace spanned by vectors in blocks $(i-1)l_b, (i-1)l_b + 1, (i-1)l_b + 2, \ldots, (i+1)l_b - 1$. Therefore, $\mathcal{Y}_i$ is orthogonal to $\mathcal{Y}_j$ for $|i - j| > 1$. The space spanned by the $\mathcal{X}_i$ for $i = 0, \ldots, n_{win} - 1$ is the same as the space spanned by the $A\mathcal{Y}_i$ for $i = 0, \ldots, n_{sb} - 1$; we will choose the space $W$ to be a subspace of this space. Let $P_i$ project onto the subspace of $\mathcal{R}$ spanned by the blocks from the $i l_b$-th block to the $((i+1)l_b - 1)$-th block. For notational convenience later (and to avoid various off-by-one errors), we define $P_{-1} = 0$, and we define $\mathcal{Y}_i$ for $i < 0$ to be the empty set.
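The index bookkeeping for the superblocks can be checked with a short sketch. The function name `superblock_layout` and the sample values `n_win=17`, `l_b=3` are illustrative assumptions, not taken from the text; block indices are clipped to the range $0, \ldots, n_{win}-1$.

```python
def superblock_layout(n_win, l_b):
    """Block index sets for the spaces Y_i and for the ranges of the P_i."""
    n_sb = -(-n_win // l_b)            # ceil(n_win / l_b)
    if n_sb % 2:                       # smallest even integer >= n_win / l_b
        n_sb += 1
    y_blocks, p_blocks = [], []
    for i in range(n_sb):
        # Y_i: blocks (i-1)*l_b ... (i+1)*l_b - 1, clipped to existing blocks
        lo, hi = max((i - 1) * l_b, 0), min((i + 1) * l_b, n_win)
        y_blocks.append(set(range(lo, hi)))
        # range of P_i: blocks i*l_b ... (i+1)*l_b - 1
        p_blocks.append(set(range(i * l_b, min((i + 1) * l_b, n_win))))
    return n_sb, y_blocks, p_blocks


n_sb, y_blocks, p_blocks = superblock_layout(n_win=17, l_b=3)
for i in range(n_sb):
    print(i, sorted(y_blocks[i]), sorted(p_blocks[i]))
```

With these sample values $n_{sb} = 6$, each $\mathcal{Y}_i$ is the union of the ranges of $P_{i-1}$ and $P_i$, and $\mathcal{Y}_i$ shares no blocks with $\mathcal{Y}_j$ once $|i-j| > 1$.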

In Fig. 3 we sketch the blocks used to define the spaces $\mathcal{Y}_i$ for the case $n_{sb} = 6$. The horizontal position in the figure indicates increasing block number, as marked in the top row. Space $\mathcal{Y}_i$ overlaps with space $\mathcal{Y}_{i \pm 1}$, as seen. We have also sketched the range of the operators $P_i$.

We claim that 

Lemma 3. There exist spaces $\mathcal{N}_i$, for $i = 0, \ldots, n_{sb} - 1$, with the properties that:

1: $\mathcal{N}_i$ is a subspace of $\mathcal{Y}_i$.

2: For any vector $v \in \mathcal{N}_i$, the quantity $(v, \rho v)$ is bounded by $|v|^2/l_b^2$ times a function $F_0(l_b)$, which grows slower than any power of $l_b$.

3: Let $N_i$ project onto $\mathcal{N}_i$. For any vector $v$ which is in the space spanned by eigenvectors of $\rho$ with eigenvalue less than $1/l_b^2$, the sum $\sum_i |N_i v|^2$ is greater than or equal to $(1 - F_1(l_b))|v|^2$, where $F_1(l_b)$ is a function decaying faster than any negative power of $l_b$.

4: For any vector $v$ which is in the space spanned by eigenvectors of $\rho$ with eigenvalue less than $1/l_b^2$ we can find vectors $n_i \in \mathcal{N}_i$ such that

$$v = \sum_i n_i + w^\perp, \qquad (35)$$

with $|w^\perp|$ bounded by a quantity going to zero faster than any power of $l_b$, and $\mathrm{Re}((n_i, n_{i+1})) \geq -F_2(l_b)|v|^2$, where $F_2(l_b)$ goes to zero faster than any power of $l_b$ and where $\mathrm{Re}(\ldots)$ denotes the real part of a quantity.

Proof. Define a matrix $M$ by

$$M = \begin{pmatrix} 0 & A^\dagger \\ A & 0 \end{pmatrix}. \qquad (36)$$

Define a matrix $O$ by

$$O = \int dt \, \exp(iMt) \, \tilde{\mathcal{F}}(0, G(l_b)/l_b, G(l_b)/l_b, t), \qquad (37)$$

where $G(l_b)$ is a function growing slower than any power of $l_b$ to be chosen later. Note that since $\tilde{\mathcal{F}}$ is even in $t$, $O$ is non-vanishing only in the upper left and lower right corners. We define $O_0$ to be the top left corner of $O$. For use later, define the matrix $Q$ by

$$Q = \begin{pmatrix} 1 & 0 \end{pmatrix}. \qquad (38)$$

For each $i$, define $\Lambda_i$ to project onto the blocks $j$ with $(i - 1/2) l_b \leq j < (i + 1/2) l_b$. Compute the eigenvalues of $\Lambda_i O_0^2 \Lambda_i$. For each eigenvalue $\lambda_a$ greater than or equal to $\lambda_0$, for some $\lambda_0$ chosen later, compute the corresponding eigenvector $x$. The quantity $\lambda_0$ will be chosen later to go to zero faster than any power of $l_b$. Let $\mathcal{N}_i$ be the space spanned by $(P_{i-1} + P_i) O_0 x$, for all such $x$. $\mathcal{N}_i$ is a subspace of $\mathcal{Y}_i$, as claimed.

Let $v$ be any vector in $\mathcal{N}_i$. By definition, $v = (P_{i-1} + P_i) O_0 x$, for some $x$. By construction, $(O_0 x, \rho O_0 x)$ is bounded by $(2G(l_b)/l_b)^2$. We now use the Lieb-Robinson bounds for the matrix $M$, by defining a position matrix which is equal to $i$ in the $i$-th block. Using the Lieb-Robinson bounds, for time $t \leq l_b/2v_{LR}$, with $v_{LR} = e^2$, we find that the norm of $(1 - P_{i-1} - P_i) Q \exp(iMt) Q^T x$, for $x$ in the range of $\Lambda_i$, is bounded by $\exp(-l_b/2)$. At the same time, the integral $\int_{|t| > l_b/2v_{LR}} dt \, \tilde{\mathcal{F}}(0, G(l_b)/l_b, G(l_b)/l_b, t)$ is bounded by $T(G(l_b)/2v_{LR})$. Since $T(x)$ decays faster than any negative power of $x$, we can choose a $G(x)$ which grows slower than any power of $x$ such that $T(G(l_b)/2v_{LR})$ still decays faster than any negative power of $l_b$. Thus, since $|y| \geq \lambda_0 |x|$ by construction, we can find a $\lambda_0$ going to zero faster than any power of $l_b$ such that $|(P_{i-1} + P_i) O_0 x - O_0 x|$ goes to zero faster than any power of $l_b$. Thus,

$$|(v, \rho v)| \leq |(O_0 x, \rho O_0 x)| + 2|(O_0 x, \rho (1 - P_{i-1} - P_i) O_0 x)| + ((1 - P_{i-1} - P_i) O_0 x, \rho (1 - P_{i-1} - P_i) O_0 x). \qquad (39)$$

Since $\rho$ is bounded in operator norm, we have bounded all terms in the above equation. This verifies 2.

Also, for any vector $v$ which is a normalized eigenvector of $\rho$ with eigenvalue less than $1/l_b^2$, we can write $v = \sum_i a_i v_i$, with $v_i$ a normalized vector in the space projected onto by $\Lambda_i$ and $a_i$ a set of complex amplitudes such that $\sum_i |a_i|^2 = 1$. Then, since $O_0 v = v$, we have $\sum_i |a_i|^2 |(O_0 v_i, v)|^2 = |v|^2 = 1$. Decompose each $v_i$ as $y_i + z_i$, where $y_i$ is the projection of $v_i$ onto eigenvectors of $\Lambda_i O_0^2 \Lambda_i$ with eigenvalue greater than or equal to $\lambda_0$ and $z_i$ is the projection onto eigenvectors with eigenvalue less than $\lambda_0$. Note that $|O_0 z_i|$ is bounded by $\sqrt{\lambda_0}$. Thus, $|O_0 v_i - O_0 y_i|$ is bounded by $\sqrt{\lambda_0}$. Thus, since $\sum_i |a_i|^2 |(O_0 v_i, v)|^2 = 1$, we have $\sum_i |(O_0 a_i y_i, v)|^2 \geq 1 - 2\sqrt{\lambda_0}$. Further, using the Lieb-Robinson bounds, $|(P_{i-1} + P_i) O_0 y_i - O_0 y_i|$ is bounded by a quantity going to zero faster than any power of $l_b$. Thus, given the lower bound on $O_0 y_i$ that $|O_0 y_i| \geq \sqrt{\lambda_0}$, we can pick a $\lambda_0$ going to zero faster than any power of $l_b$ such that $|(P_{i-1} + P_i) O_0 y_i - O_0 y_i| / |O_0 y_i|$ is bounded by a quantity going to zero faster than any power of $l_b$. Thus, the angle between $O_0 y_i$ and $(P_{i-1} + P_i) O_0 y_i$ goes to zero faster than any power of $l_b$, and so the projection of $O_0 y_i$ into $\mathcal{N}_i$ is equal to $O_0 y_i$ plus a vector going to zero faster than any power of $l_b$. Thus, $|N_i v|^2$ is greater than or equal to $|a_i|^2 |(O_0 y_i, v)|^2$ minus a quantity going to zero faster than any power of $l_b$. This verifies 3.

The proof of 4 is similar to 3. Write $v = \sum_i a_i v_i$ as in the above paragraph. As before, decompose $v_i = y_i + z_i$. Then, $|v - \sum_i (P_{i-1} + P_i) O_0 y_i|$ is bounded by a quantity going to zero faster than any power of $l_b$. Let $n_i = (P_{i-1} + P_i) O_0 y_i$. Then, $n_i \in \mathcal{N}_i$ as claimed and we have upper bounded $|w^\perp| = |v - \sum_i n_i|$ as claimed. Now consider $(n_i, n_{i+1})$. This is equal to $(\sum_{j \leq i} n_j, \sum_{j > i} n_j)$. Note that $|\sum_{j \leq i} n_j| = |\sum_{j \leq i} (P_{j-1} + P_j) O_0 y_j|$. However, $|\sum_{j \leq i} (P_{j-1} + P_j) O_0 y_j - O_0 \sum_{j \leq i} v_j|$ is upper bounded by $|v|$ times a quantity going to zero faster than any power of $l_b$. Thus, since $\|O_0\| \leq 1$, we have upper bounded $|\sum_{j \leq i} n_j|$ by $|\sum_{j \leq i} v_j|$ plus $|v|$ times a quantity going to zero faster than any power of $l_b$. Similarly, we have upper bounded $|\sum_{j > i} n_j|$ by $|\sum_{j > i} v_j|$ plus $|v|$ times a quantity going to zero faster than any power of $l_b$. Now consider $|\sum_j n_j|^2$. This is equal to

$$\Big|\sum_j n_j\Big|^2 = \Big|\sum_{j \leq i} n_j\Big|^2 + \Big|\sum_{j > i} n_j\Big|^2 + 2\mathrm{Re}((n_i, n_{i+1})). \qquad (40)$$

We have upper bounded the first two terms on the right-hand side of the above equation by $|\sum_{j \leq i} v_j|^2 + |\sum_{j > i} v_j|^2$ plus $|v|^2$ times a quantity going to zero faster than any power of $l_b$. Since $|\sum_{j \leq i} v_j|^2 + |\sum_{j > i} v_j|^2 = |v|^2$, and since $|\sum_j n_j - v|$ is bounded by a quantity going to zero faster than any power of $l_b$ so that $|\sum_j n_j|^2$ is superpolynomially close to $|v|^2$, we have lower bounded $\mathrm{Re}((n_i, n_{i+1}))$, as desired. □

Remark: The definition of $M$ and $O$ as block matrices in the above lemma is simply a trick to make the claims 2, 3 in the lemma depend on $l_b^{-2}$ rather than $l_b^{-1}$, as we would have found without this trick of introducing block matrices. In physics jargon, near the edge of the band (eigenvalues close to zero for $\rho$, which is a positive semi-definite matrix), we have dynamic critical exponent 2 rather than 1.
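The reason the block-matrix trick squares the energy scale can be seen numerically. The sketch below assumes that $\rho = A^\dagger A$ and that $M$ is the off-diagonal block matrix built from $A$ and $A^\dagger$; the random matrix `A` and its dimensions are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))
rho = A.conj().T @ A                      # assumed definition: rho = A^dagger A

# M = [[0, A^dagger], [A, 0]] acts on the direct sum of the two spaces.
M = np.block([[np.zeros((4, 4)), A.conj().T],
              [A, np.zeros((5, 5))]])

M2 = M @ M
top_left = M2[:4, :4]                     # equals rho
sing = np.linalg.svd(A, compute_uv=False)
eig_M = np.linalg.eigvalsh(M)             # +/- singular values of A, plus one zero

print(np.allclose(top_left, rho))
print(np.sort(np.abs(eig_M)))
```

Since $M^2$ is block diagonal with $\rho$ in its upper left corner, restricting $M$ to eigenvalues of magnitude at most $w$ restricts $\rho$ to eigenvalues at most $w^2$. A filter of width of order $1/l_b$ on $M$ therefore gives bounds of order $1/l_b^2$ on $\rho$.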

We now describe the construction. The numeric constant $\eta$ below is some sufficiently small, positive, real number; the choice of this number will be discussed in lemma (4).

We iteratively construct a sequence of spaces $\mathcal{N}'_i$ for odd $i$ which are subspaces of $\mathcal{N}_i$ as follows. For each $i = 1, 3, \ldots$, consider the space $\mathcal{N}_i$. Let $Q_i$ be the projector onto the span of $\mathcal{N}_{i+1}$, $\mathcal{N}_{i-1}$ and $\mathcal{N}'_{i-2}$ (if $i = 1$, then $\mathcal{N}'_{-1}$ is considered to be the empty space). Apply Jordan's lemma [28] to $N_i$ and to $Q_i$ to construct a complete orthonormal basis $n_{i,b}$ for $\mathcal{N}_i$ such that $(n_{i,b}, Q_i n_{i,c}) = 0$ for $b \neq c$. Let $\mathcal{N}'_i$ be the space spanned by vectors $n_{i,b}$ such that $|Q_i n_{i,b}|^2 \leq 1/2 + \eta$. Define $\mathcal{U}^\perp$ to be the space spanned by the $\mathcal{N}_i$ for even $i$ and by the $\mathcal{N}'_i$ for odd $i$. Define $\mathcal{U}$ to be the subspace of $\mathcal{R}$ orthogonal to $\mathcal{U}^\perp$. We now define $W = A\mathcal{U}$. Let $P$ be the projector onto $W$. We define $U$ to be the projector onto $\mathcal{U}$. Thus, for any vector $w$ with $Pw = w$, we have $w = Av$, for some $v$ with $Uv = v$.
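One way to obtain a basis with the property $(n_{i,b}, Q_i n_{i,c}) = 0$ for $b \neq c$ is to diagonalize the compression of $Q_i$ to the subspace. The following is a minimal sketch under that assumption; the helper name `select_retained_vectors` and the random test subspace are hypothetical and not part of the construction in the text.

```python
import numpy as np

def select_retained_vectors(basis, Q, eta):
    """basis: columns form an orthonormal basis of a subspace; Q: orthogonal projector."""
    # The compression C = B^dagger Q B is Hermitian with spectrum in [0, 1].
    C = basis.conj().T @ Q @ basis
    weights, coeffs = np.linalg.eigh(C)
    new_basis = basis @ coeffs            # rotated basis n_b with (n_b, Q n_c) = 0 for b != c
    keep = weights <= 0.5 + eta           # |Q n_b|^2 = (n_b, Q n_b) <= 1/2 + eta
    return new_basis, weights, new_basis[:, keep]


rng = np.random.default_rng(1)
basis, _ = np.linalg.qr(rng.standard_normal((8, 3)))    # a 3-dimensional subspace
q_cols, _ = np.linalg.qr(rng.standard_normal((8, 4)))
Q = q_cols @ q_cols.T                                    # rank-4 projector
new_basis, weights, retained = select_retained_vectors(basis, Q, eta=0.01)
print(weights, retained.shape)
```

The eigenvalues `weights` are the squared norms $|Q n_b|^2$, so the threshold test in the text reduces to a comparison on the spectrum of the compressed projector.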

We claim one important property of this space $\mathcal{U}$:

Lemma 4. Let $\eta$ be a sufficiently small positive number. Let $x_i$ be any vector in $\mathcal{Y}_i$. Consider the vector $U x_i$. Project this vector into space $\mathcal{Y}_j$. The norm of the resulting vector is bounded by

$$\mathrm{const.} \times \exp(-|i - j|/\mathrm{const.}) |x_i|, \qquad (41)$$

for some positive numeric constants.

Proof. We consider instead the vector $(1 - U)x_i$ and bound the norm of the projection of that vector. Since the space $\mathcal{U}^\perp$ is the span of spaces $\mathcal{N}_j$ for $j$ even and $\mathcal{N}'_j$ for $j$ odd, the projection $(1 - U)x_i$ can be computed by minimizing the quantity

$$\Big| \sum_j a_j n_j - x_i \Big| \qquad (42)$$

over all $n_j$ in $\mathcal{N}_j$ for $j$ even and $n_j$ in $\mathcal{N}'_j$ for $j$ odd, with $|n_j| = 1$, and over all complex numbers $a_j$. Then, $(1 - U)x_i = \sum_j a_j n_j$. For given $x_i$, let the minimum be obtained for some definite choice of vectors $n_j$. Then, we consider the minimum over $a_j$ of (42). We can write Eq. (42) in a matrix form, by introducing a tridiagonal matrix $M$, with diagonal entries equal to unity and entries $M_{j,j+1} = (n_j, n_{j+1})$. Then, define a vector $x$ which has its $j$-th entry equal to $(n_j, x_i)$. Define a vector $a$, with $j$-th entry equal to $a_j$. Then, the minimum over $a_j$ is given by

$$a = M^{-1} x. \qquad (43)$$

We will prove an exponential decay on entries of $M^{-1}$. That is, we will define $G = M^{-1}$ and prove that the matrix element $G_{ij}$ decays exponentially in $|i - j|$. Then, since the only non-zero entries of $x$ are the $i-1$, $i$, and $i+1$ entries, this will prove the desired result (41).
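The step from the minimization to $a = M^{-1}x$ is the usual normal-equations solution of a least-squares problem with Gram matrix $M$. Here is a small numerical sketch; the vectors `vecs` are an artificial example chosen so that only neighbouring vectors overlap, and are not the vectors of the proof.

```python
import numpy as np

rng = np.random.default_rng(2)
dim, n = 40, 12
# Unit vectors n_j supported on coordinates 3j..3j+3, so only neighbours overlap.
vecs = np.zeros((dim, n))
for j in range(n):
    vecs[3 * j: 3 * j + 4, j] = rng.standard_normal(4)
vecs /= np.linalg.norm(vecs, axis=0)

M = vecs.T @ vecs                    # tridiagonal Gram matrix with unit diagonal
x = rng.standard_normal(dim)         # vector to be approximated
x_tilde = vecs.T @ x                 # j-th entry is (n_j, x)

a = np.linalg.solve(M, x_tilde)      # a = M^{-1} x_tilde
a_lstsq = np.linalg.lstsq(vecs, x, rcond=None)[0]

G = np.linalg.inv(M)
print(np.abs(G[0]))                  # entries of the first row of G = M^{-1}
```

The coefficients from the Gram-matrix solve agree with a direct least-squares fit, and the printed row of $G$ shows how quickly the entries fall off away from the diagonal in this example.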



To prove this decay, it suffices to prove a lower bound on the smallest eigenvalue of $M$. By construction, we have the property that for odd $i$,

$$|M_{i,i+1}|^2 + |M_{i,i-1}|^2/(1 - |M_{i-1,i-2}|^2) \leq (1/2) + \eta, \qquad (44)$$

since $|M_{i,i-1}|^2/(1 - |M_{i-1,i-2}|^2)$ is the squared norm of the projection of $n_i$ onto the span of $n_{i-1}$ and $n_{i-2}$. It suffices to prove the lower bound on the smallest eigenvalue for matrices $M$ that saturate Eq. (44), that is, those for which the inequality becomes an equality. Further, it suffices to prove the lower bound for matrices in the case $\eta = 0$, since any matrix that obeys Eq. (44) with non-zero $\eta$ is close in operator norm (the distance depends on $\eta$) to a matrix that obeys Eq. (44) with $\eta = 0$, and hence if there is a lower bound on the smallest eigenvalue for $\eta = 0$, then we have a lower bound on the smallest eigenvalue for sufficiently small $\eta$.

We prove the lower bound in the case $\eta = 0$ as follows. We choose $\lambda$ to be some small number. We consider the matrix $M - \lambda I$, where $I$ is the identity matrix, and let $M^{(n)}$ denote a sub-matrix of $M - \lambda I$, containing the first $n$ rows and columns, and define $G^{(n)}$ to be $(M^{(n)})^{-1}$. Let $G_n$ be the matrix element $G^{(n)}_{nn}$. We will prove an upper bound on $G_n$ for all $n$, for some non-zero value of $\lambda$, in the case $\eta = 0$. This will prove the lower bound.

To compute $G_n$, we have a recursion relation:

$$G_n = \frac{1}{1 - \lambda - y^2} + \frac{x^2 G_{n-2} \, y^2/(1 - \lambda - y^2)^2}{1 - x^2 G_{n-2}/(1 - \lambda - y^2)}, \qquad (45)$$

where $x = |M_{n-2,n-1}|$ and $y = |M_{n-1,n}|$.
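For orientation, the corner element of the inverse of a tridiagonal matrix can always be computed one row at a time by a Schur-complement (continued-fraction) recursion. The sketch below implements this standard one-step recursion, not the two-step form quoted above, and checks it against a direct inverse. The indexing (sub-matrix on indices $0, \ldots, n$) and the random off-diagonal entries are assumptions made for the example.

```python
import numpy as np

def corner_elements(offdiag, lam):
    """G_n = [(M^(n))^{-1}]_{nn} for M - lam*I, with M tridiagonal and unit diagonal.

    offdiag[k] = M_{k,k+1}; here M^(n) keeps indices 0..n (an assumed convention).
    """
    d = 1.0 - lam
    G = [1.0 / d]
    for m in offdiag:
        # Schur complement of the last row/column: G_n = 1 / (d - |M_{n-1,n}|^2 G_{n-1})
        G.append(1.0 / (d - abs(m) ** 2 * G[-1]))
    return G


rng = np.random.default_rng(3)
offdiag = rng.uniform(0.0, 0.45, size=9)
lam = 0.02
G = corner_elements(offdiag, lam)

full = (1.0 - lam) * np.eye(10) + np.diag(offdiag, 1) + np.diag(offdiag, -1)
direct = [np.linalg.inv(full[: n + 1, : n + 1])[n, n] for n in range(10)]
print(np.allclose(G, direct))
```

Applying the one-step recursion twice expresses $G_n$ through $G_{n-2}$, which is the kind of two-step relation used in the argument.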

We can apply this recursion relation twice to compute $G_n$ from $G_{n-4}$. Let us call $a = |M_{n-4,n-3}|$, $b = |M_{n-3,n-2}|$, $x = |M_{n-2,n-1}|$, and $y = |M_{n-1,n}|$. Then, Eq. (44) implies that (taking $\eta = 0$)

$$a^2 + b^2 \leq 1/2, \qquad y^2 + x^2/(1 - b^2) \leq 1/2. \qquad (46)$$

For what follows, it suffices to consider the case in which the inequalities in the above equation become equalities:

$$a^2 + b^2 = 1/2, \qquad y^2 + x^2/(1 - b^2) = 1/2. \qquad (47)$$

We do an inductive proof of the bound on $G_n$. For use in the induction, we say that "property * holds for $G_n$" if two conditions hold. First, $G_n < 2.1$ and second, if $G_n > 1.9$ then $|M_{n,n-1}| > 0.54$. The following claims were checked by computer assistance: we choose a fine grid on possible values of $b, x$ (in this case, every possible value between 0 and 1 with a spacing of $10^{-4}$). We then computed the values of $a, y$ from Eq. (47), and verified that certain claims held; we then verified that the chosen spacing was sufficiently fine that other claims held for all possible choices of $b, x$.

We have chosen $\lambda = 0.02$, and have verified that, for all $b, x$, if $G_{n-4} < 1.9$, then $G_n < 1.9$ unless $y > 0.54$ (this was done by verifying a stronger claim for all $b, x$ on the grid, namely that $G_n < 1.89$, and then using smoothness properties of the functions). We further verified that assuming $G_{n-4} < 1.9$, then the maximum possible value of $G_n$ was at $x = 0$ and gave $2.08333\ldots < 2.1$. Thus, we verified that if $G_{n-4} < 1.9$, then property * holds for $G_n$. For use in the inductive result below, we call this result the "first simulation".

Next, we assumed that $b > 0.54$, and that $G_{n-4} < 2.1$, and again verified that $G_n < 1.9$ unless $y > 0.54$, and that the maximum possible value of $G_n$ was at $x = 0$ and gave $2.08333\ldots < 2.1$. That is, we verified that if $G_{n-4} < 2.1$ and $|M_{n-2,n-3}| > 0.54$ then property * holds for $G_n$. Again, this was done by verifying stronger statements for all $b, x$ on the grid and then using smoothness. We call the results in this paragraph the "second simulation".

Finally, we verified that if $G_n < 1.9$, then for any choice of $M_{n,n+1}$ and $M_{n+1,n+2}$ with $|M_{n,n+1}|^2 + |M_{n+1,n+2}|^2 \leq 1/2$, we have $G_{n+2} < 2.1$. This is the "third simulation".

Now we prove that $G_m$ is bounded by 2.1 for all even $m$. Start with $G_0$. Since $G_0 = 1$, we have that property * holds for $G_0$. Now, consider any even $m$ and assume that property * holds for $G_m$. Also assume that $G_l < 2.1$ for all even $l < m$. Then, if $|M_{m,m-1}| > 0.54$, we use the inductive assumption that $G_{m-2} < 2.1$ and the second simulation above (taking $n = m + 2$ so we compute $G_{m+2}$ from $G_{m-2}$, in which case the assumption $|M_{m,m-1}| > 0.54$ means that $b > 0.54$) to show that either $G_{m+2} < 1.9$ or $G_{m+2} < 2.1$ and $|M_{m+1,m+2}| > 0.54$. That is, if $|M_{m,m-1}| > 0.54$, we show that property * holds for $G_{m+2}$. On the other hand, if $|M_{m,m-1}| < 0.54$, then $G_m < 1.9$. In this case, by the third simulation, $G_{m+2} < 2.1$, and by the first simulation (now taking $n = m + 4$), either $G_{m+4} < 1.9$ or $G_{m+4} < 2.1$ and $|M_{m+3,m+4}| > 0.54$, so that property * holds for $G_{m+4}$. Note that this induction is non-standard: we assume that property * holds for $G_m$, and we prove that the same hypothesis holds for either $m + 2$ or for $m + 4$. However, we prove the bound that $G_m < 2.1$ for all even $m$. This bounds $G_m$ for $\lambda = 0.02$; using monotonicity properties of the recurrence of $G$, we then bound it for all $\lambda < 0.02$. □

Remark: The existence of a lower bound on the smallest eigenvalue is not surprising given the following argument. The argument in this paragraph is intended as heuristic, but could perhaps be turned into an alternate proof of the lemma above. Consider weakening the conditions on the off-diagonal entries of the matrix $M$ to the condition that $|M_{n-2,n-1}|^2 + |M_{n-1,n}|^2 \leq 1/2$ for even $n$. One may show in this case that $G_n$ is bounded by 2 for all even $n$. To see this, assume that $G_{n-2}$ is bounded by 2. Maximizing $G_n$ over $M_{n-2,n-1}$ and $M_{n-1,n}$ subject to the constraint gives that $G_n$ is bounded by 2. Thus, using induction, $G_n$ is bounded for even $n$. Now, $G_n$ may become infinite for odd $n$ in this case (suppose that $M_{0,1} = 0$, $M_{1,2} = 1/\sqrt{2}$, $M_{2,3} = 1/\sqrt{2}$, $M_{3,4} = 0$, so that $G_3$ is infinite). One may similarly show in this fashion that $M$ is positive semi-definite (simply repeat this inductive argument for the matrix $M + zI$ for any real $z > 0$ to show that $G_n$ is bounded for all real $z > 0$). Now, in the lemma above we assumed stronger constraints on $M$; it is perhaps not surprising then that adding the stronger constraints means that rather than being merely positive semi-definite, $M$ has a lower bound on its smallest eigenvalue.
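The heuristic bound in this remark is easy to probe numerically. The sketch below draws random off-diagonal entries obeying the weakened constraint, computes $G_n$ for even $n$ by direct inversion, and also checks the singular example quoted in the remark. Indexing the sub-matrix by indices $0, \ldots, n$ and setting $\lambda = 0$ are assumptions made for the example.

```python
import numpy as np

def tridiag(offdiag):
    """Tridiagonal matrix with unit diagonal and the given off-diagonal entries."""
    n = len(offdiag) + 1
    return np.eye(n) + np.diag(offdiag, 1) + np.diag(offdiag, -1)


rng = np.random.default_rng(4)
worst = 0.0
for _ in range(200):
    off = []
    for _ in range(6):                        # pairs (M_{n-2,n-1}, M_{n-1,n}) for even n
        theta = rng.uniform(0, np.pi / 2)
        r = np.sqrt(rng.uniform(0, 0.5))      # x^2 + y^2 = r^2 <= 1/2
        off += [r * np.cos(theta), r * np.sin(theta)]
    M = tridiag(off)                          # indices 0..12
    for n in range(0, 13, 2):
        worst = max(worst, np.linalg.inv(M[: n + 1, : n + 1])[n, n])

# Example from the remark: M_{0,1} = 0, M_{1,2} = M_{2,3} = 1/sqrt(2), M_{3,4} = 0.
example = tridiag([0.0, 1 / np.sqrt(2), 1 / np.sqrt(2), 0.0])
min_eig_odd = np.linalg.eigvalsh(example[:4, :4]).min()
print(worst, min_eig_odd)
```

In this experiment the largest even-index corner element stays at or below 2, while the sub-matrix on indices $0, \ldots, 3$ of the quoted example has a vanishing eigenvalue, which is why $G_3$ diverges there.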



D. Properties of W 



In this section we establish certain properties for the space $W$. The main results are Eq. (49), controlling the overlap between vectors in this space, and Eq. (50), showing that for any vector $v$ in the space spanned by the $\mathcal{X}_i$ with $v = Ax$, the vector $Pv \in W$ is close to $v$, where the maximum distance $|Pv - v|$ between the vectors depends on $|x|$.

First Property: By construction, for any vector $r \in \mathcal{U}$, with $|r| = 1$, we have

$$(r, \rho r) \geq \mathrm{const.} \times (1/l_b^2), \qquad (48)$$

for sufficiently large $l_b$.

To show Eq. (48), we bound the inner product between $r$ and $w$ for $w$ in the span of eigenvectors of $\rho$ with eigenvalue less than or equal to $1/l_b^2$. Any such $w$ can be written as a linear combination of vectors $w^{even}$ in the span of $\mathcal{N}_i$ for even $i$ and $w^{odd}$ in the span of $\mathcal{N}_i$ for odd $i$. Note that $w^{even}$ is in $\mathcal{U}^\perp$. By 4 in lemma (3), the inner product $(w^{even}, w^{odd})$ is greater than or equal to minus a quantity going to zero faster than any power of $l_b$. Thus, $|(1 - U)w|^2$ is greater than or equal to $|w^{even}|^2$ minus a quantity going to zero faster than any power of $l_b$.

Further, we write $w^{odd} = w_1 + w_3$, where $w_1 = \sum_{i=1,5,9,\ldots} n_i$ and $w_3 = \sum_{i=3,7,11,\ldots} n_i$, with $n_i \in \mathcal{N}_i$. Note that by construction, each $n_i$ has projection at least $(1/2 + \eta)|n_i|^2$ onto the span of spaces $\mathcal{N}'_{i-2}, \mathcal{N}_{i-1}, \mathcal{N}_{i+1}$. Further, this span of spaces $\mathcal{N}'_{i-2}, \mathcal{N}_{i-1}, \mathcal{N}_{i+1}$ is orthogonal to the span of $\mathcal{N}'_{j-2}, \mathcal{N}_{j-1}, \mathcal{N}_{j+1}$ if $|i - j| \geq 4$. Thus, the projection of $w_1$ onto the span of $\mathcal{N}_i$ over even $i$ and $\mathcal{N}'_i$ over odd $i$ is at least $(1/2 + \eta)|w_1|^2$, and so $|(1 - U)w_1|^2 \geq (1/2 + \eta)|w_1|^2$ and similarly $|(1 - U)w_3|^2 \geq (1/2 + \eta)|w_3|^2$. Since $(w_1, w_3) = 0$, we have

$$|(1 - U)(w_1 + w_3)|^2 \geq |(1 - U)w_1|^2 + |(1 - U)w_3|^2 - 2|(w_1, U w_3)| \geq |(1 - U)w_1|^2 + |(1 - U)w_3|^2 - 2|U w_1||U w_3|.$$

Since $2|U w_1||U w_3| \leq |U w_1|^2 + |U w_3|^2$,

$$|(1 - U)(w_1 + w_3)|^2 \geq |(1 - U)w_1|^2 + |(1 - U)w_3|^2 - |U w_1|^2 - |U w_3|^2 \geq 2\eta(|w_1|^2 + |w_3|^2),$$

and so $|(1 - U)w|^2 \geq 2\eta|w|^2$ minus a quantity going to zero superpolynomially in $l_b$. Therefore, having upper bounded $|Uw|$, we have upper bounded $(r, w)$, so Eq. (48) follows.

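For any $v \in W$, we can find $x_i \in \mathcal{X}_i$ for $i = 0, \ldots, n_{win} - 1$, such that $v = \sum_i x_i$. Therefore, from Eq. (48), for any $v \in W$, we can find $x_i$, $i = 0, \ldots, n_{win} - 1$, with $x_i \in \mathcal{X}_i$ and $v = \sum_i x_i$ with

$$|v|^2 \geq \mathrm{const.} \times (1/l_b)^2 \sum_{i=0}^{n_{win}-1} |x_i|^2. \qquad (49)$$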



Second Property: We also claim that for any vector $v$ in the space spanned by the $\mathcal{X}_i$, such that $v = Ax$, that

$$|Pv - v| \leq \mathrm{const.} \times (\sqrt{F_0(l_b)}/l_b)|x|. \qquad (50)$$

To show Eq. (50), any vector $x$ can be written as a linear combination of a vector in $\mathcal{U}$ and a vector $x^\perp \in \mathcal{U}^\perp$. Let $x = \sum_i x_i$, with $x_i \in \mathcal{Y}_i$ and $\sum_i |x_i|^2 = |x|^2$. The vector $x^\perp = (1 - U)x$. Let $x^\perp = \sum_j a_j n_j$, with $n_j$ in $\mathcal{N}_j$ for $j$ even and $n_j$ in $\mathcal{N}'_j$ for $j$ odd as in lemma (4). We bound $|Ax^\perp|^2$ by $\mathrm{const.} \times (F_0(l_b)/l_b^2) \sum_j |a_j|^2 \leq \mathrm{const.} \times (F_0(l_b)/l_b^2) \sum_j |x_j|^2$, where the last inequality uses the exponential decay on matrix elements of $G$ from lemma (4).

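Any vector $v = \sum_{i=0}^{n_{win}-1} x_i$ with $x_i \in \mathcal{X}_i$ can be written as $v = Ax$ with $|x|^2 = \sum_{i=0}^{n_{win}-1} |x_i|^2$. Therefore, Eq. (50) implies that for any $v = \sum_{i=0}^{n_{win}-1} x_i$, with $x_i \in \mathcal{X}_i$, we have

$$|Pv - v| \leq \mathrm{const.} \times (\sqrt{F_0(l_b)}/l_b) \sqrt{\sum_{i=0}^{n_{win}-1} |x_i|^2}. \qquad (51)$$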



E. Verification of Claims 



We now verify the claims regarding the subspace W. 
Proof of First Claim: To prove (1), note that for any vector $v \in B$ we have

$$\sum_{i=0}^{n_{win}-1} \ldots \qquad (52)$$

For any $v \in V_1$, with $|v| = 1$, we can write $v = Sx$ with $|x| = 1$, and then, from Eq. (30),

$$\Big| v - \sum_{i=0}^{n_{win}-1} \tau_i (1 - Z_i) x \Big|^2 \leq 2/L^2. \qquad (53)$$



The vector $\tau_i(1 - Z_i)x$ is in $\mathcal{X}_i$. So, by Eq. (51),

$$\Big| (1 - P) \sum_{i=0}^{n_{win}-1} \tau_i (1 - Z_i) x \Big| \leq \mathrm{const.} \times (\sqrt{F_0(l_b)}/l_b) \sqrt{\sum_{i=0}^{n_{win}-1} |\tau_i (1 - Z_i) x|^2} \leq \mathrm{const.} \times (\sqrt{F_0(l_b)}/l_b). \qquad (54)$$

Combining Eqs. (53,54) with a triangle inequality verifies the first claim, given that $F(L)$ is chosen to grow slower than any power of $L$.

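Proof of Second Claim: To prove the second claim (2), consider any vector $v \in W$. We have $v = \sum_i A x_i$, with $x_i \in \mathcal{Y}_i$ and $U \sum_i x_i = \sum_i x_i$, so $v = \sum_i A U x_i$. So, $|(1 - P)Jv| = |\sum_i (1 - P) J A U x_i|$. We have $|\sum_i (1 - P) J A U x_i|^2 = \sum_{i,j} (J A U x_i, (1 - P) J A U x_j)$. Consider the projector onto the $k$-th block of the space $\mathcal{R}$. Note that $(1 - P) A U x_i = 0$, so $(1 - P)\omega(i l_b) A U x_i = 0$, so

$$\Big| \sum_i (1 - P) J A U x_i \Big|^2 = \Big| (1 - P) \sum_i (J - \omega(i l_b)) A U x_i \Big|^2 \leq \Big| \sum_i (J - \omega(i l_b)) A U x_i \Big|^2 \leq \sum_{i,j} \big( (J - \omega(i l_b)) A U x_i, (J - \omega(j l_b)) A U x_j \big). \qquad (55)$$

Using the decay in lemma (4), the inner product above decays exponentially in $|i - j|$, so we can sum over $i, j$ to find

$$(Jv, (1 - P)Jv) \leq \mathrm{const.} \times (l_b \kappa)^2 \sum_i |x_i|^2. \qquad (56)$$

By Eqs. (56,49), we have

$$|(1 - P)Jv|^2 \leq \mathrm{const.} \times (l_b \kappa)^2 \sum_i |x_i|^2 \leq \mathrm{const.} \times l_b^2 (l_b \kappa)^2 |v|^2 = \mathrm{const.} \times (l_b^2 \kappa)^2 |v|^2, \qquad (57)$$

verifying the second claim.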

Proof of Third Claim: As we established before, using the Lieb-Robinson bound, for the given choice of $F(x)$ the norm of the projection of any vector $y \in \mathcal{X}_i$ onto $V_L$ is bounded by $|y|$ times a function decaying faster than any negative power of $L$. Let $P_{V_L}$ project onto $V_L$. Using Eq. (49), we find that the projection of any vector $v \in W$ onto $V_L$ is bounded by (writing $v = \sum_{i=0}^{n_{win}-1} w_i$ with $w_i \in \mathcal{X}_i$)

$$|P_{V_L} v|^2 \leq n_{win} \sum_{i=0}^{n_{win}-1} |P_{V_L} w_i|^2 \leq \mathrm{const.} \times n_{win} \max_i (|P_{V_L} w_i|^2/|w_i|^2) \, l_b^2 |v|^2. \qquad (58)$$

Since $(|P_{V_L} w_j|^2/|w_j|^2)$ is bounded by a function decaying faster than any negative power of $L$, this verifies the third claim.

This completes the proof of Lemma (2). After giving the error bounds in the next section, we explain some of the motivation behind the above construction, and comment on the easier case in which $J$ is a tridiagonal matrix, rather than a block tridiagonal matrix.



V. ERROR BOUNDS 

We finally give the error bounds to obtain theorems (1,2). To obtain (2), we pick

$$n_{cut} = \Delta^{-1/4}, \qquad (59)$$

so that $L = \lfloor (2/n_{cut})/\Delta - 1 \rfloor$ is of order $2/\Delta^{3/4}$. Then, from lemma (2) and Eq. (17), in the new basis the block-off-diagonal terms in $H$ are bounded in operator norm by a constant times $\Delta^{1/4}$ times a function growing slower than any power of $1/\Delta$. By Eq. (18), the difference between $B$ and $B'$ is bounded in operator norm by a constant times $\Delta^{1/4}$. Therefore, theorem (2) follows. To obtain theorem (1), we pick

$$\Delta = \delta^{4/5} \qquad (60)$$

in lemma (1).

We omit the detailed analysis, but it is possible to choose $E(x)$ to be a polylog as follows. We can pick $T(x)$ to decay like $\exp(-x^\eta)$, for any $\eta < 1$[351 [3U]. Then we can pick $F(L)$ to equal $\log(L)^\theta$, for $\theta > 1/\eta$, so that $T(F(L)) \sim \exp(-(\log(L))^{\theta\eta})$ decays faster than any power.
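The last statement can be illustrated numerically: when $\theta\eta > 1$, the quantity $\exp(-(\log L)^{\theta\eta})$ eventually beats any fixed inverse power of $L$. The values $\eta = 0.5$, $\theta = 3$ and the power $k = 3$ below are illustrative assumptions only.

```python
import math

def T_of_F(L, eta=0.5, theta=3.0):
    """exp(-(log L)^(theta*eta)) for the assumed choices T(x) = exp(-x^eta), F(L) = log(L)^theta."""
    return math.exp(-(math.log(L) ** theta) ** eta)

def scaled(L, k):
    """L^k * T(F(L)); this tends to zero as L grows, for any fixed power k."""
    return L ** k * T_of_F(L)


for L in (1e3, 1e6, 1e9, 1e12):
    print(L, scaled(L, 3))
```

For small $L$ the polynomial factor can still dominate; the decay sets in once $(\log L)^{\theta\eta - 1}$ exceeds $k$.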



VI. TRIDIAGONAL MATRICES 



In this section, we present tighter bounds for the case in which H is a tridiagonal matrix, rather than a block 
tridiagonal matrix. 

Remark: The difficulty we face is that the $\mathcal{X}_i$ are not orthogonal to each other. If they were orthogonal, then many of the estimates would be easier. Consider the case in which $J$ is a tridiagonal matrix, so that $V_1$ is one dimensional. Let $\rho(E)$ be a smoothed density of states at energy $E$: $\rho(E) = \mathrm{tr}(S^\dagger \mathcal{F}(E, 1/L, 1/L, J)^\dagger \mathcal{F}(E, 1/L, 1/L, J) S)$. Suppose $\rho(E)$ is such that it has a peak in the crossing points of Fig. 2a (the points where one function $\mathcal{F}$ is decreasing and the other is increasing and they cross). Then, with the overlapping windows as shown, we find that most of the smoothed density of states lies in the overlap between the windows, rather than in the windows themselves. The overlap between the vectors in different windows is large. In the case of a tridiagonal matrix, we can combine two of the windows as shown in Fig. 2b to reduce the overlap of the normalized vectors; this general idea will motivate the construction in this section.

We prove that 

Lemma 5. Let $J$ be an $L$-by-$L$ Hermitian tridiagonal matrix, with $\|J\| \leq 1$, acting on a space $B$. Let $v_j$ denote the vector with a 1 in the $j$-th entry and zeroes elsewhere. Then, there exists a space $W$ which is a subspace of $B$ with the following properties:

(1): The projection of $v_1$ onto the orthogonal complement of $W$ has norm bounded by $\epsilon_3$, where $\epsilon_3$ is equal to a constant times $1/L$.

(2): For any normalized vector $w \in W$, the projection of $Jw$ onto the orthogonal complement of $W$ has norm bounded by $\epsilon_4$, where $\epsilon_4$ is equal to $1/L$ times a function growing slower than any power of $L$.

(3): The projection of $v_L$ onto $W$ has norm bounded by $\epsilon_5$, where $\epsilon_5$ is a function decaying faster than any power of $L$.

This lemma implies theorem (3): we construct $A', B'$ as before, following steps (3) to construct the new basis, but because of the tighter bounds in lemma (5) we can choose $n_{cut} = \Delta^{-1/2}$ when constructing the new basis. Now, in step (4), we find that $A', B'$ are diagonal matrices, rather than just block diagonal matrices.

For each $i = 0, 1, \ldots, n_{win} - 1$, define

$$\omega(i) = -1 + i\kappa, \qquad (61)$$

as before. Define

$$\rho_i = \mathrm{tr}(S^\dagger \mathcal{F}(\omega(i), 0, \kappa, J)^\dagger \mathcal{F}(\omega(i), 0, \kappa, J) S) = |\mathcal{F}(\omega(i), 0, \kappa, J) v_1|^2. \qquad (62)$$

Set

$$\lambda_{min} = 1/(n_{win} L^2), \qquad (63)$$

with

$$n_{win} = L/F(L) \qquad (64)$$

as before. To prove Lemma (5), we use the following algorithm. There are $n_{win}$ windows, labeled $0, \ldots, n_{win} - 1$. We label various windows as either "unmarked" or "marked"; windows which are marked get marked by an integer label.

1: Set $i = 0$. Initialize a real variable $x$ to 0. Initialize an integer counter $a$ to 1. Initialize all windows to unmarked.

2: Set $x$ to 0. If $\rho_i < \lambda_{min}$,
then
2a: Increment $i$ by one.
2b: If $i \geq n_{win}$, terminate. Otherwise, go to step 2.
endif

3: Mark window $i$ with label $a$.

4: Set $x$ to $x + \rho_i$. If $x < 9\rho_i$,
then
4a: Increment $i$ by one.
4b: If $i \geq n_{win}$, terminate. Otherwise, go to step 3.
endif

5: Increment $a$ by one. Increment $i$ by one. If $i \geq n_{win}$, terminate. Otherwise, go to step 2.
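The marking procedure in steps 1 to 5 can be sketched as a short loop. The function name `mark_windows`, the use of `None` for unmarked windows, and the sample weights are illustrative assumptions; windows are taken to be labeled $0, \ldots, n_{win} - 1$, so the loop stops once the index runs past the last window.

```python
def mark_windows(rho, lam_min):
    """Return one label per window (None = unmarked), following steps 1-5."""
    n_win = len(rho)
    labels = [None] * n_win
    i, a = 0, 1                               # step 1
    while i < n_win:
        if rho[i] < lam_min:                  # step 2: skip windows with small weight
            i += 1
            continue
        x = 0.0                               # step 2 resets the running sum
        while i < n_win:
            labels[i] = a                     # step 3: mark window i with label a
            x += rho[i]                       # step 4: accumulate the weight
            if x < 9 * rho[i]:
                i += 1                        # step 4a: extend the sequence
            else:
                break                         # sequence ends at window i
        a += 1                                # step 5: next label, next window
        i += 1
    return labels


labels = mark_windows([0.0, 0.5, 0.3, 0.01, 0.0, 0.4, 0.02], lam_min=0.05)
print(labels)
```

Note that a sequence always contains at least two windows unless it starts at the last window, because the first window of a sequence trivially satisfies $x < 9\rho_i$ when $\rho_i > 0$.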

After running this algorithm, there will be sequences of marked windows all marked with the same integer label $a$. There may be one or more unmarked windows separating the sequences of marked windows. In step 2, we scan along to find an $i$ with $\rho_i \geq \lambda_{min}$, and then in step 4 we mark a sequence of windows. We claim that the length of a sequence of marked windows is at most $1 + \lceil \log_{10/9}(2/\lambda_{min}) \rceil$. This bound on the length of a sequence of marked windows holds because at the start of a sequence $x$ is at least $\lambda_{min}$, $x$ grows exponentially along the sequence (otherwise in step 4 we find that $\rho_{i+1} > (1/9)x$ for some $i$), and $x$ can be at most 2 since $\sum_{i=0}^{n_{win}-1} \rho_i \leq 2$.

Let the total number of sequences be $n_{seq}$. Note that $n_{seq} \leq n_{win}$.

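For each sequence of windows marked with a given integer $a$, from window $i$ to $j$, construct the vector $y_a$ given by

$$y_a = \sum_{k=i}^{j} \mathcal{F}(\omega(k), 0, \kappa, J) v_1 = \mathcal{F}((\omega(i) + \omega(j))/2, (\omega(j) - \omega(i))/2, \kappa, J) v_1. \qquad (65)$$

The inner product $(y_a, y_{a+1})$ is equal to $(\mathcal{F}(\omega(j), 0, \kappa, J) v_1, y_{a+1})$. By Cauchy-Schwarz, this is bounded by $|\mathcal{F}(\omega(j), 0, \kappa, J) v_1| |y_{a+1}|$. To estimate $|\mathcal{F}(\omega(j), 0, \kappa, J) v_1|$, we use $|\mathcal{F}(\omega(j), 0, \kappa, J) v_1|^2 = \rho_j \leq \sum_{k=i}^{j} \rho_k/9 \leq |y_a|^2/9 = \sum_{k=i}^{j} \sum_{k'=i}^{j} (\mathcal{F}(\omega(k), 0, \kappa, J) v_1, \mathcal{F}(\omega(k'), 0, \kappa, J) v_1)/9$, where the first inequality is by construction and the second inequality follows from the fact that $(\mathcal{F}(\omega(k), 0, \kappa, J) v_1, \mathcal{F}(\omega(k'), 0, \kappa, J) v_1) \geq 0$. Therefore, $(y_a, y_{a+1}) \leq (|y_a|/\sqrt{9}) |y_{a+1}|$, so

$$(y_a, y_{a+1}) \leq \frac{1}{3} |y_a| |y_{a+1}|. \qquad (66)$$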
We define $W$ to be the space spanned by all such vectors $y_a$, and we define $P$ to project onto $W$. Consider any vector $v \in W$, with

$$v = \sum_{a=1}^{n_{seq}} v_a, \qquad (67)$$

with $v_a$ parallel to $y_a$. By Eq. (66),

$$|v|^2 \geq \frac{1}{3} \sum_{a=1}^{n_{seq}} |v_a|^2. \qquad (68)$$

Remark: The function $\mathcal{F}((\omega(i) + \omega(j))/2, (\omega(j) - \omega(i))/2, \kappa, \omega)$ is equal to unity for $\omega(i) \leq \omega \leq \omega(j)$.

We now prove Lemma (5) as follows: to prove the first claim, note that by construction,

$$|P v_1 - v_1|^2 \leq \Big| \sum_a y_a - v_1 \Big|^2 \leq 2/L^2. \qquad (69)$$

The second inequality of the above equation follows because the difference $\sum_a y_a - v_1$ is equal to $-\sum_{i \, \mathrm{unmarked}} \mathcal{F}(\omega(i), 0, \kappa, J) v_1$, where the sum ranges over $i$ such that the corresponding window is unmarked.

To prove the second claim, consider the $a$-th sequence of marked windows, from window $i$ to window $j$. Let $\omega_a = (\omega(i) + \omega(j))/2$. Then,

\{J - uj a )y a \ < I \\y a \ (70) 

which is bounded by 1/L times a function growing slower than any power of L. Therefore, 

Ki-p)^!^ 2 ^ 10 ^^^ 1 )!^! (7i) 



Using the bound Eq. (68), for any vector $v \in W$,

|(1 - P) Jv\ < 2V3( 2+riOSl0/9(2/Amm)1 ) M, (72) 



which is bounded by 1/L times a function growing slower than any power of L, verifying the second claim. 
The proof of the third claim is identical to the previous case. 



VII. QUANTUM MEASUREMENT 
A. Construction and Results 



The constructions above can be applied to operators which arise in various physical quantum systems. For example, consider a quantum spin for a large spin $S$. Then, the operators $S_x/S$ and $S_y/S$ have operator norm 1 and have a commutator that is of order $1/S$. Thus, we can find a basis in which both operators are almost diagonal. While it is well known that one can use a POVM (positive operator-valued measure) to approximately measure $S_x$ and $S_y$ at the same time, the existence of the given basis implies that one can approximately measure $S_x$ and $S_y$ simultaneously with a single projective measurement. Interestingly, while the operator $S_z^2$ is also almost diagonal in this basis (since it equals $S(S+1) - S_x^2 - S_y^2$), it is not possible to find a basis in which $S_x$, $S_y$, and $S_z$ are all almost diagonal (this obstruction is similar to that in [5]). Therefore, to approximately measure $S_x$, $S_y$, and $S_z$ simultaneously will require a POVM, rather than a projective measurement.
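The scaling quoted for the spin example can be verified directly from the standard spin-$S$ matrices. In the sketch below, the helper `spin_matrices` and the choice $S = 10$ are illustrative; the matrices are built in the basis of $S_z$ eigenstates with $m = S, S-1, \ldots, -S$.

```python
import numpy as np

def spin_matrices(S):
    """Spin-S matrices S_x, S_y, S_z in the S_z eigenbasis, m = S, S-1, ..., -S."""
    dim = int(round(2 * S)) + 1
    m = S - np.arange(dim)
    sp = np.zeros((dim, dim), dtype=complex)          # raising operator S_+
    for k in range(1, dim):
        # <m+1| S_+ |m> = sqrt(S(S+1) - m(m+1)); state m[k] is raised to index k-1
        sp[k - 1, k] = np.sqrt(S * (S + 1) - m[k] * (m[k] + 1))
    sx = (sp + sp.conj().T) / 2
    sy = (sp - sp.conj().T) / 2j
    sz = np.diag(m).astype(complex)
    return sx, sy, sz


S = 10
sx, sy, sz = spin_matrices(S)
a, b = sx / S, sy / S
comm_norm = np.linalg.norm(a @ b - b @ a, 2)          # operator norm of the commutator
print(np.linalg.norm(a, 2), comm_norm)
```

Because $[S_x, S_y] = iS_z$, the commutator of the rescaled operators is $iS_z/S^2$, whose operator norm is $1/S$.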

For completeness, we now briefly show how to construct a POVM to approximately measure several almost commuting operators simultaneously. Consider any number $N$ of Hermitian matrices, labeled $A_1, \ldots, A_N$, with $\|[A_i, A_j]\| \leq \delta$ for all $i, j$ and with $\|A_i\| \leq 1$ for all $i$. We now construct a POVM to approximately measure all $N$ operators simultaneously. The physical idea is very simple: we first do a "soft" measurement of $A_N$, then $A_{N-1}$, and so on, until all operators are measured.

Let $n_{win}$ be some integer given by

$$n_{win} = \lceil \delta^{-1/2} (N - 1)^{-1/2} \rceil \qquad (73)$$

($n_{win}$ will typically be much larger than unity). For $i = 1, \ldots, N$ and $n = 0, \ldots, n_{win} - 1$, define

$$\omega(i) = -1 + 2i/(n_{win} - 1) = -1 + i\kappa, \qquad (74)$$

where $\kappa = 2/(n_{win} - 1)$ as before, and define

$$M(i, n) = \sqrt{\mathcal{F}(\omega(n), 0, \kappa, A_i)}. \qquad (75)$$

The definition of $\mathcal{F}$ is given at the start of section IV; we will see later in this section that we do not actually need $\mathcal{F}$ to be infinitely differentiable as it is defined there, but have only weaker requirements on $\mathcal{F}$. Define

$$O(n_1, n_2, \ldots, n_N) = (M(1, n_1)^\dagger M(2, n_2)^\dagger \ldots M(N, n_N)^\dagger)(M(N, n_N) \ldots M(2, n_2) M(1, n_1)). \qquad (76)$$

Then,

$$\sum_{n_1, n_2, \ldots = 0}^{n_{win}-1} O(n_1, n_2, \ldots, n_N) = 1, \qquad (77)$$

and all of the operators $O(n_1, n_2, \ldots, n_N)$ are positive semidefinite by construction. Therefore, the operators $O(n_1, n_2, \ldots, n_N)$ form a POVM. Note that $M(i, n_i) = M(i, n_i)^\dagger$, but we continue to write daggers on the operators for clarity.
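The nested structure of this POVM can be illustrated with a small sketch. The window function used here is a triangular "hat" partition of unity, which is an assumption made for simplicity and is not the infinitely differentiable function of section IV; the helper names and the random test matrices are likewise hypothetical.

```python
import itertools
import numpy as np

def hat(center, kappa, x):
    """Triangular window of half-width kappa; shifted copies sum to 1 on [-1, 1]."""
    return np.clip(1.0 - np.abs(x - center) / kappa, 0.0, None)

def soft_measurement_ops(A, n_win):
    """Operators M(n) = sqrt(window_n(A)) for a Hermitian A with spectrum in [-1, 1]."""
    kappa = 2.0 / (n_win - 1)
    evals, evecs = np.linalg.eigh(A)
    ops = []
    for n in range(n_win):
        f = hat(-1.0 + n * kappa, kappa, evals)
        ops.append((evecs * np.sqrt(f)) @ evecs.conj().T)
    return ops

def povm_elements(mats, n_win):
    """O(n_1..n_N) = K^dagger K with K = M(N, n_N) ... M(2, n_2) M(1, n_1)."""
    per_op = [soft_measurement_ops(A, n_win) for A in mats]
    elems = {}
    for outcome in itertools.product(range(n_win), repeat=len(mats)):
        K = np.eye(mats[0].shape[0], dtype=complex)
        for i, n in enumerate(outcome):
            K = per_op[i][n] @ K          # M(1, .) acts first, M(N, .) last
        elems[outcome] = K.conj().T @ K
    return elems

def random_hermitian(rng, dim):
    X = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    H = (X + X.conj().T) / 2
    return 0.9 * H / np.linalg.norm(H, 2)   # spectrum safely inside [-1, 1]


rng = np.random.default_rng(6)
mats = [random_hermitian(rng, 5), random_hermitian(rng, 5)]
elems = povm_elements(mats, n_win=4)
total = sum(elems.values())
print(np.allclose(total, np.eye(5)))
```

Completeness holds for any set of matrices, commuting or not, because the sum over the last outcome collapses to the identity first and the remaining sums then collapse one at a time.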

We claim that this POVM approximately measures all operators simultaneously. That is, we will show that for any density matrix $\rho$, if the outcome of the measurement is $n_1, n_2, \ldots, n_N$, then if we perform a subsequent measurement of any operator $A_i$, the outcome will be close to $\omega(n_i)$ with high probability. We show this by computing the expectation value of $(A_i - \omega(n_i))^2$ averaged over all measurement outcomes. For any density matrix $\rho$, for any $i$, the average over all outcomes of $(A_i - \omega(n_i))^2$ is equal to

$$
\sum_{n_1, n_2, \ldots = 0}^{n_{win}-1} \mathrm{tr}\big((A_i - \omega(n_i))^2 M(1,n_1) M(2,n_2) \cdots \rho \cdots M(2,n_2)^\dagger M(1,n_1)^\dagger\big). \tag{78}
$$

The main result in this section is that

$$
\sum_{n_1, n_2, \ldots = 0}^{n_{win}-1} \mathrm{tr}\big((A_i - \omega(n_i))^2 M(1,n_1) M(2,n_2) \cdots \rho \cdots M(2,n_2)^\dagger M(1,n_1)^\dagger\big) \leq \mathrm{const.} \times (N-1)\delta. \tag{79}
$$

We show this in the next subsection.
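
The average in Eq. (78) can be evaluated directly for small matrices. The sketch below is a minimal illustration under stated assumptions: `window` is again an assumed smooth partition of unity standing in for $\mathcal{F}$, and `mean_square_error` and the test matrices are hypothetical. For an exactly commuting pair every term is supported where $|A_i - \omega(n_i)| < \kappa$, so the result cannot exceed $\kappa^2$; the almost commuting pair is evaluated for comparison only, with no claim about the constant in Eq. (79).

```python
import itertools
import numpy as np


def g(s):
    s = np.asarray(s, dtype=float)
    out = np.zeros_like(s)
    pos = s > 0
    out[pos] = np.exp(-1.0 / s[pos])
    return out


def window(u):
    """Assumed stand-in for F: smooth partition of unity supported on |u| < 1."""
    a = np.abs(u)
    num = g(1.0 - a)
    return num / (num + g(a))


def soft_measurement_ops(A, n_win):
    kappa = 2.0 / (n_win - 1)
    evals, evecs = np.linalg.eigh(A)
    ops = []
    for n in range(n_win):
        weights = np.sqrt(window((evals - (-1.0 + n * kappa)) / kappa))
        ops.append((evecs * weights) @ evecs.conj().T)
    return ops


def mean_square_error(As, rho, n_win, i):
    """Evaluate the sum in Eq. (78) for operator index i (0-based)."""
    kappa = 2.0 / (n_win - 1)
    Ms = [soft_measurement_ops(A, n_win) for A in As]
    dim = rho.shape[0]
    total = 0.0
    for ns in itertools.product(range(n_win), repeat=len(As)):
        K = np.eye(dim, dtype=complex)
        for j, n in enumerate(ns):
            K = K @ Ms[j][n]  # K = M(1,n_1) M(2,n_2) ... M(N,n_N)
        dev = As[i] - (-1.0 + ns[i] * kappa) * np.eye(dim)
        total += np.trace(dev @ dev @ K @ rho @ K.conj().T).real
    return total


rng = np.random.default_rng(0)
dim, n_win = 6, 5
kappa = 2.0 / (n_win - 1)
rho = np.eye(dim) / dim  # maximally mixed state, chosen for simplicity

# Exactly commuting pair: two diagonal matrices with spectrum in [-1, 1].
A1 = np.diag(rng.uniform(-1, 1, dim))
A2 = np.diag(rng.uniform(-1, 1, dim))
err_commuting = mean_square_error([A1, A2], rho, n_win, i=0)

# Almost commuting pair: A2 plus a small symmetric perturbation, rescaled to norm <= 1.
R = rng.normal(size=(dim, dim))
B2 = A2 + 1e-3 * (R + R.T) / 2
B2 = B2 / max(1.0, np.linalg.norm(B2, 2))
err_almost = mean_square_error([A1, B2], rho, n_win, i=0)
```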

B. Bounds 

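Note that $\|\sum_{n_i} M(i,n_i)\| \leq \sqrt{2}$. To bound Eq. (78), we need three results, Eqs. (80), (81), and (82) below. First,

$$
\sum_{n_i=0}^{n_{win}-1} \|(A_i - \omega(n_i)) M(i,n_i)\| \leq \mathrm{const.} \times \kappa \leq \mathrm{const.} \times 1/n_{win}. \tag{80}
$$

Second, we need

$$
\Big\| \sum_{n_j=0}^{n_{win}-1} [M(j,n_j), (A_i - \omega(n_i))]\, O\, M(j,n_j)^\dagger \Big\| \leq \mathrm{const.} \times (\delta/\kappa) \|O\| \tag{81}
$$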
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for any operator O. 
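Third, we need

$$
\Big\| \sum_{n_j=0}^{n_{win}-1} [M(j,n_j), (A_i - \omega(n_i))]\, O\, [M(j,n_j)^\dagger, (A_i - \omega(n_i))] \Big\| \leq \mathrm{const.} \times (\delta/\kappa)^2 \|O\| \tag{82}
$$

for any operator $O$.

Eq. (80) follows immediately from the support of $\mathcal{F}$. To show Eq. (81), define

$$
A^0 = \kappa \int dt\, \exp(iA_j t)\,(A_i - \omega(n_i))\, \exp(-iA_j t)\, f(\kappa t), \tag{83}
$$

where the function $f(t)$ is defined to have the Fourier transform as in Eq. (?). Then, $\|A^0 - (A_i - \omega(n_i))\| \leq \mathrm{const.} \times \delta/\kappa$ as in Lemma 1. Also, if $v_1, v_2$ are eigenvectors of $A_j$ with corresponding eigenvalues $x_1, x_2$ with $|x_1 - x_2| \geq \kappa$, then $(v_1, A^0 v_2) = 0$, which implies that

$$
\Big\| \sum_{n_j=0}^{n_{win}-1} [M(j,n_j), A^0]\, O\, M(j,n_j)^\dagger \Big\| \leq 2 \max_{n_j} \big( \| [M(j,n_j), A^0]\, O\, M(j,n_j)^\dagger \| \big). \tag{84}
$$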

Eq. (84) is the reason for introducing the operator $A^0$. We can bound the commutator $[M(j,n_j), A^0]$ as follows. Note that $\|[A_j, A^0]\| \leq \delta$. Write

$$
M(j,n_j) = \int dt\, \exp(iA_j t)\, \widetilde{\sqrt{\mathcal{F}}}(\omega(n_j), 0, \kappa, t), \tag{85}
$$

where $\widetilde{\sqrt{\mathcal{F}}}(\cdot, \cdot, \cdot, t)$ denotes the Fourier transform of the square root of $\mathcal{F}$. Then since $\|[\exp(iA_j t), A^0]\| \leq \mathrm{const.} \times |t|\delta$, we can use a triangle inequality to show that

$$
\|[M(j,n_j), A^0]\| \leq \int dt\, \big|\widetilde{\sqrt{\mathcal{F}}}(\omega(n_j), 0, \kappa, t)\big|\, |t|\, \delta. \tag{86}
$$

Then, since $\sqrt{\mathcal{F}}(\cdot, \cdot, \cdot, \omega)$ is infinitely differentiable, the Fourier transform decays faster than any power of $t$ and the integral over $t$ converges, so we have $\|[M(j,n_j), A^0]\| \leq \mathrm{const.} \times \delta/\kappa$. Using Eq. (84) gives Eq. (81). Eq. (82) is derived similarly.
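
The commutator estimate $\|[\exp(iA_j t), A^0]\| \leq \mathrm{const.} \times |t|\delta$ used above can be checked numerically. The sketch below is an illustration only: it takes a generic almost commuting Hermitian pair `A`, `B` in place of $A_j$ and $A^0$, and the helper names are hypothetical. For Hermitian `A` the standard Duhamel argument gives the bound with constant 1, which is what the ratios computed here probe.

```python
import numpy as np


def expi(A, t):
    """exp(i A t) for Hermitian A, via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(A)
    return (evecs * np.exp(1j * evals * t)) @ evecs.conj().T


def comm(X, Y):
    return X @ Y - Y @ X


def op_norm(X):
    """Operator (spectral) norm."""
    return np.linalg.norm(X, 2)


rng = np.random.default_rng(1)
dim = 8


def random_hermitian(dim):
    X = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    H = (X + X.conj().T) / 2
    return H / op_norm(H)


A = random_hermitian(dim)
# B is a function of A plus a small perturbation, so [A, B] is small but nonzero.
B = A @ A + 0.01 * random_hermitian(dim)
delta = op_norm(comm(A, B))

ts = np.linspace(-20.0, 20.0, 81)
ratios = [
    op_norm(comm(expi(A, t), B)) / (abs(t) * delta)
    for t in ts
    if abs(t) > 1e-12  # skip t = 0, where both sides vanish
]
max_ratio = max(ratios)
```

Inserting this linear-in-$|t|$ bound under the integral in Eq. (85) is what produces the factor $|t|\delta$ in Eq. (86).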



Using Eqs. (80), (81), and (82), we can bound the sum in Eq. (78) by writing $(A_i - \omega(n_i))^2 = (A_i - \omega(n_i))(A_i - \omega(n_i))$, commuting one of the terms $(A_i - \omega(n_i))$ to the right through $M(j,n_j)$ for $j < i$ until it hits $M(i,n_i)$, and commuting the other term $(A_i - \omega(n_i))$ to the left through $M(j,n_j)^\dagger$ for $j < i$ until it hits $M(i,n_i)^\dagger$. Therefore,

$$
\sum_{n_1, n_2, \ldots = 0}^{n_{win}-1} \mathrm{tr}\big((A_i - \omega(n_i))^2 M(1,n_1) M(2,n_2) \cdots \rho \cdots M(2,n_2)^\dagger M(1,n_1)^\dagger\big)
\leq \mathrm{const.} \times \big((i-1)^2 \delta^2 n_{win}^2 + (i-1)\delta\, n_{win}/n_{win} + 1/n_{win}^2\big)
\leq \mathrm{const.} \times \big((N-1)^2 \delta^2 n_{win}^2 + (N-1)\delta + 1/n_{win}^2\big). \tag{87}
$$

The first term on the right-hand side of Eq. (87) arises from two non-vanishing commutators (if the non-vanishing commutators are with $M(j,n_j)$ and $M(k,n_k)^\dagger$ for $j \neq k$ then we use Eq. (81) twice, but if $j = k$ we use Eq. (82) once). The second term arises from one non-vanishing commutator and one use of Eq. (80), and the last term arises from using Eq. (80) twice. Choosing

$$
n_{win} = \lceil \delta^{-1/2} (N-1)^{-1/2} \rceil,
$$

we find that we measure all operators to within a mean-square error of order $(N-1)\delta$, as claimed.
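
To see why this choice balances the bound, the short sketch below evaluates the three terms on the right-hand side of Eq. (87) for sample values of $\delta$ and $N$. It assumes the rounding in Eq. (73) is a ceiling, and the function names and sample values are hypothetical.

```python
import math


def n_windows(delta, N):
    """Number of windows from Eq. (73); the ceiling is an assumed rounding."""
    return math.ceil(delta ** -0.5 * (N - 1) ** -0.5)


def error_terms(delta, N):
    """The three terms of Eq. (87), up to the unspecified constant."""
    n_win = n_windows(delta, N)
    two_commutators = (N - 1) ** 2 * delta ** 2 * n_win ** 2
    one_commutator = (N - 1) * delta
    no_commutator = 1.0 / n_win ** 2
    return n_win, (two_commutators, one_commutator, no_commutator)


delta, N = 1e-4, 3
n_win, terms = error_terms(delta, N)
target = (N - 1) * delta  # the claimed order of the mean-square error
```

With these sample values all three terms agree with $(N-1)\delta$ to within a few percent; a smaller $n_{win}$ would inflate the last term and a larger one the first.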

Note that we did not actually require that $\mathcal{F}(\cdot, \cdot, \cdot, \omega)$ be infinitely differentiable in this section. We only required that the Fourier transform $\widetilde{\sqrt{\mathcal{F}}}(\cdot, \cdot, \cdot, t)$ decay sufficiently rapidly in $t$ that the integral (86) converges. The other properties of $\mathcal{F}$ we used are that $\sum_n \mathcal{F}(\omega(n), 0, \kappa, \omega) = 1$ for $-1 \leq \omega \leq 1$ and that $\mathcal{F}(\omega(n), 0, \kappa, \omega)$ vanishes for $|\omega - \omega(n)| \geq \kappa$.
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VIII. DISCUSSION 



The main result is an explicit construction of a pair of exactly commuting matrices which are close to a pair of almost commuting matrices. The construction is explicit and can be handled easily on a computer for modest $N$. We have in fact implemented the construction in Lemma [?] for the uniform chain. In practical applications, we expect that, for many tridiagonal matrices, the lack of orthogonality of the $x_i$ will not cause a problem, and choosing $W$ to be the space spanned by the $x_i$ will lead to satisfactory results, without having to follow the more complicated procedure above. If, for some particular $J$, the lack of orthogonality of the $x_i$ does cause a problem, an alternative procedure that might be more useful in practice than the deterministic procedure above is to add small, randomly chosen matrices to each diagonal block of $J$. This may smooth out the spectrum of $J$ and then allow one to choose $W$ to be the space spanned by the $x_i$.

We gave above applications to quantum measurement. Another application of this result is to construct Wannier functions for any two-dimensional quantum system with a spectral gap. In [31], it was pointed out that given a two-dimensional quantum system with a gap between bands, one could define an operator $G$ which projects onto the bands below the gap. Then, define the operators $X$ and $Y$ to measure the $x$ and $y$ positions of particles, and define $GXG$ and $GYG$ as the projections of $X$ and $Y$ into the lowest band. Let $\|X\|, \|Y\| = L$, where $L$ is the linear size of the system. Since the operator $G$ was constructed in [31] as a short-range operator, the commutator $\|[GXG, GYG]\|$ is small compared to $L^2$, and thus we can use the results here to construct a basis of Wannier functions which is localized in both the x- and y-directions.
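
The smallness of $\|[GXG, GYG]\|$ relative to $L^2$ can be illustrated on a toy model. The sketch below is an assumption-laden example, not the construction of [31]: the Hamiltonian (nearest-neighbour hopping on an open $L \times L$ lattice with a staggered on-site potential) is a hypothetical gapped system chosen for simplicity, and $G$ is obtained by direct diagonalization.

```python
import numpy as np


def staggered_lattice(L, t=1.0, Delta=3.0):
    """Hypothetical gapped model: hopping t on an open L x L lattice
    plus a staggered on-site potential +/- Delta."""
    n = L * L
    H = np.zeros((n, n))
    xs = np.zeros(n)
    ys = np.zeros(n)
    for x in range(L):
        for y in range(L):
            i = x * L + y
            xs[i], ys[i] = x + 1, y + 1  # positions 1..L, so ||X|| = ||Y|| = L
            H[i, i] = Delta * (-1) ** (x + y)
            if x + 1 < L:
                j = (x + 1) * L + y
                H[i, j] = H[j, i] = -t
            if y + 1 < L:
                j = x * L + (y + 1)
                H[i, j] = H[j, i] = -t
    return H, np.diag(xs), np.diag(ys)


L = 8
H, X, Y = staggered_lattice(L)
evals, evecs = np.linalg.eigh(H)

# G projects onto the states below the gap (negative energies in this model).
occupied = evecs[:, evals < 0]
G = occupied @ occupied.T
gap = evals[evals > 0].min() - evals[evals < 0].max()

GXG = G @ X @ G
GYG = G @ Y @ G
comm_norm = np.linalg.norm(GXG @ GYG - GYG @ GXG, 2)
ratio = comm_norm / L ** 2  # the quantity that should be small
```

The pair $GXG/L$, $GYG/L$ then has norm at most one and a small commutator, which is the setting in which the results above apply.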
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